"Dann Corbit" <[EMAIL PROTECTED]> writes: > There seems to be something seriously defective with hash indexes in old > versions of PostgreSQL.
They still suck; I'm not aware of any situation where I'd recommend hash over btree indexes in Postgres. I think we have fixed the hash indexes' deadlock problems as of 7.4, but there's still no real performance advantage.

I just had an epiphany as to the probable reason why the performance sucks. It's this: the hash bucket size is the same as the page size (ie, 8K).

This means that if you have only one or a few items per bucket, the information density is awful, and you lose big on I/O requirements compared to a btree index. On the other hand, if you have enough items per bucket to make the storage density competitive, you will be doing linear searches through dozens if not hundreds of items that are all in the same bucket, and you lose on CPU time (compared to btree, which can do a binary search to find an item within a page).

It would probably be interesting to look into making the hash bucket size be just a fraction of a page, with the intent of having no more than a couple dozen items per bucket. I'm not sure what the implications would be for intra-page storage management or index locking conventions, but offhand it seems like there wouldn't be any insurmountable problems.

I'm not planning on doing this myself, just throwing it out as a possible TODO item for anyone who's convinced that hash indexes ought to work better than they do.

			regards, tom lane
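To put rough numbers on the trade-off described above, here is a back-of-envelope sketch in C. The 8K figure matches the page size mentioned in the post; the 20-byte entry size and the comparison counts are illustrative assumptions, not measurements of the real hash access method.

/*
 * Sketch of the bucket-size trade-off: with one bucket per 8K page,
 * few items per bucket wastes the page (bad I/O density), while many
 * items per bucket forces long linear scans (bad CPU), versus the
 * O(log n) binary search a btree can do within a page.
 * ENTRY_SIZE is an assumed per-entry cost, not a PostgreSQL constant.
 */
#include <stdio.h>

#define PAGE_SIZE  8192   /* hash bucket == one page, per the post */
#define ENTRY_SIZE   20   /* assumed bytes per index entry */

int main(void)
{
    for (int items = 1; items <= 256; items *= 4)
    {
        /* fraction of the bucket page actually holding entries */
        double density = (double) (items * ENTRY_SIZE) / PAGE_SIZE;

        /* average probes for a linear scan of the bucket */
        double linear_cmps = items / 2.0;

        /* comparisons for a binary search over the same items,
         * i.e. roughly ceil(log2(items)) */
        int binary_cmps = 0;
        for (int n = 1; n < items; n *= 2)
            binary_cmps++;

        printf("%4d items/bucket: %5.1f%% of page used, "
               "~%6.1f linear cmps vs %d binary cmps\n",
               items, 100.0 * density, linear_cmps, binary_cmps);
    }
    return 0;
}

Running this shows both failure modes directly: at one item per bucket the page is about 0.2% full, while at 256 items per bucket an average lookup does ~128 linear comparisons against 8 for a binary search.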