29 October 2008

PHP Namespaces

Succinctly put. I'm a fan of worse is better, but this is pushing it way too far. PHP's like a bunch of infinite monkeys whacking on keyboards, alternately coming up with brilliant stuff like SimpleXML, WordPress and Drupal, or brillant [sic] stuff like the namespaces fiasco (":::" is "hard to parse/easy to be confused"), and magic_quotes and register_globals. Utterly tasteless decisions.

10 September 2008

FOSS.my in the planning.

We're overdue for a FOSS event:

http://foss.org.my/projects/events/foss.my/FOSS.my

Tentative date: 8 Nov 2008.

We'll have an IRC meet about it, probably on #myoss, Freenode, on the evening of 16 Sep 2008. Everyone should join in if possible, even just to lurk.

See you there.

07 August 2008

nginx "bug"

Encountered an odd Drupal + Apache2 + nginx interaction bug today. Basically, "Transfer-Encoding: chunked" was applied to the content twice when Drupal returned a 404. Doesn't seem to trigger outside of Drupal, IIRC. Don't know exactly whose fault it was, but:

http://www.ruby-forum.com/topic/152435

described it, and nginx author Igor Sysoev gave a patch to fix it. His own development version of nginx doesn't seem to have that patch incorporated, but the patch made the bug go away.

Win. There's a hacked nginx .deb for Ubuntu Hardy amd64 at my informal work site.

24 July 2008

Patch your nameservers!

Just patched a number of our nameservers ( http://www.doxpara.com/ ). Note that it's not your content (authoritative) nameservers that matter here, but your own resolvers and your upstream's nameservers. Check with the tool at www.doxpara.com. Ingenious way to check for the vulnerability, btw.

If you don't trust your upstream's DNS, run your own patched nameserver, but don't forward queries upstream; resolve straight from the root servers instead.

P.S. Use OpenDNS or our own patched DNS: 202.190.85.116 (temporary while upstream patches theirs. I'll remove this in a bit)

11 June 2008

Caching queries via partial indices in PostgreSQL.

Straight to the point:


foo=# select count(*) from bar;
 count
--------
 624569
(1 row)

foo=# explain analyze select count(*) from bar where baz ilike '%some%string%' and quux = '123';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=25846.58..25846.59 rows=1 width=0) (actual time=1543.654..1543.655 rows=1 loops=1)
   ->  Seq Scan on bar  (cost=0.00..25846.53 rows=17 width=0) (actual time=288.667..1543.556 rows=32 loops=1)
         Filter: (((baz)::text ~~* '%some%string%'::text) AND (quux = '123'::bpchar))
 Total runtime: 1543.798 ms


It takes 1.5 seconds on our sample data to find the 32 rows we want out of 624569. If this happens to be a large table and we often need to run this query, rather than caching the query with some external store or trigger machinery, we can use PostgreSQL's partial indices to do the work for us:


foo=# create index bar_idx_some_string on bar(id) where baz ilike '%some%string%' and quux = '123';
CREATE INDEX
foo=# explain analyze select count(*) from bar where baz ilike '%some%string%' and quux = '123';
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12.71..12.72 rows=1 width=0) (actual time=0.470..0.473 rows=1 loops=1)
   ->  Index Scan using bar_idx_some_string on bar  (cost=0.00..12.67 rows=17 width=0) (actual time=0.122..0.406 rows=32 loops=1)
 Total runtime: 0.534 ms


Now it takes 0.0005 seconds, roughly 2900 times faster.

This also works when, within a single transaction or query session, you need to run a number of queries against a table with similar conditions. Create the index first (give it a temporary name), run the queries, then drop the index.
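A minimal sketch of that create/query/drop pattern, using Python's sqlite3 since SQLite also supports partial indices. The table, columns and data are the made-up placeholders from this post (bar, baz, quux), and SQLite's LIKE (case-insensitive for ASCII by default) stands in for PostgreSQL's ILIKE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE bar (id INTEGER PRIMARY KEY, baz TEXT, quux TEXT)")

# Synthetic data: 32 matching rows buried in a larger table.
rows = [("nothing interesting", "999")] * 10000 \
     + [("here is some matching string", "123")] * 32
cur.executemany("INSERT INTO bar (baz, quux) VALUES (?, ?)", rows)

# Create the temporary partial index covering exactly the conditions
# the upcoming queries will reuse.
cur.execute("""CREATE INDEX bar_idx_tmp ON bar(id)
               WHERE baz LIKE '%some%string%' AND quux = '123'""")

# Run the queries; their WHERE clause matches the index condition,
# so the planner can answer them from the small partial index.
cur.execute("""SELECT count(*) FROM bar
               WHERE baz LIKE '%some%string%' AND quux = '123'""")
n = cur.fetchone()[0]
print(n)  # -> 32

# Done with this session's queries: drop the index so it doesn't
# tax future inserts and updates.
cur.execute("DROP INDEX bar_idx_tmp")
conn.close()
```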

Caveats:
Creating the index takes about as long as the original query, so you need to reuse the condition enough times to earn back the initial investment.

Inserts and updates on the table take a teeny bit more time. Works best on infrequently updated tables.

Kind of useless if the index condition matches a large percentage of the table.

You can use some rather complicated functions and conditions for your index, but they must be immutable and must not use external info (other tables, time of day).

CREATE INDEX locks the table against writes. Use CREATE INDEX CONCURRENTLY on live systems. It comes with its own caveats though.

Docs: CREATE INDEX