On Wed, Aug 11, 2021 at 10:30 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
> > I suspect it would be hard to store multiple hash values, one per
> > column. It seems to me that what we ought to do is combine the hash
> > values for the individual columns using hash_combine(64) and store the
> > combined value. I can't really imagine why we would NOT do that.
>
> That would make it impossible to use the index except with queries
> that provide equality conditions on all the index columns. Maybe
> that's fine, but it seems less flexible than other possible definitions.
> It really makes me wonder why anyone would bother with a multicol
> hash index.
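
To make the trade-off above concrete, here is a minimal C sketch of the combined-hash idea. Everything named demo_* is hypothetical: demo_hash_column is a generic 64-bit mixer standing in for a datatype's real hash support function, and demo_hash_combine64 is only in the spirit of hash_combine64, not copied from the PostgreSQL source. The assumption is that the index stores just the one combined value per entry, with nothing kept per column.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical per-column hash: a generic invertible 64-bit mixer.
 * Assumption: it stands in for whatever hash function the column's
 * datatype would really supply.
 */
static uint64_t
demo_hash_column(uint64_t value)
{
    value ^= value >> 30;
    value *= UINT64_C(0xbf58476d1ce4e5b9);
    value ^= value >> 27;
    value *= UINT64_C(0x94d049bb133111eb);
    value ^= value >> 31;
    return value;
}

/*
 * Hypothetical mixing step in the spirit of hash_combine64.
 * Assumption: the constant and shift amounts are illustrative only.
 */
static uint64_t
demo_hash_combine64(uint64_t a, uint64_t b)
{
    a ^= b + UINT64_C(0x49a0f4dd15e5a8e3) + (a << 54) + (a >> 7);
    return a;
}

/* The only value stored for a two-column entry: one combined hash. */
static uint64_t
demo_index_key(uint64_t col1, uint64_t col2)
{
    return demo_hash_combine64(demo_hash_column(col1),
                               demo_hash_column(col2));
}

/* Bucket choice, assuming a power-of-two number of buckets. */
static uint32_t
demo_bucket(uint64_t key, uint32_t nbuckets)
{
    return (uint32_t) (key & (nbuckets - 1));
}

/* Equality on both columns reproduces the key; changing either one does not. */
static void
demo_check(void)
{
    assert(demo_index_key(1, 2) == demo_index_key(1, 2));
    assert(demo_index_key(1, 2) != demo_index_key(1, 3));
}
```

A lookup that knows both columns can recompute demo_index_key and go straight to demo_bucket(key, nbuckets). A lookup that knows only col1 cannot derive the stored key at all, because entries sharing col1 end up with unrelated combined hashes. That is the restriction to equality conditions on all index columns.
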

Hmm. That is a point I hadn't considered.

I have to admit that after working with Amit on all the work to make hash indexes WAL-logged a few years ago, I was somewhat disillusioned with the whole AM. It seems like a cool idea to me but it's just not that well-implemented. For example, the strategy of just doubling the number of buckets in one shot seems pretty terrible for large indexes, and ea69a0dead5128c421140dc53fac165ba4af8520 will buy only a limited amount of relief. Likewise, the fact that keys are stored in hash value order within pages but that the bucket as a whole is not kept in order seems like it's bad for search performance and really bad for implementing unique indexes with reasonable amounts of locking. (I don't know how the present patch tries to solve that problem.)

It's tempting to think that we should think about creating something altogether new instead of hacking on the existing implementation, but that's a lot of work and I'm not sure what specific design would be best.

--
Robert Haas
EDB: http://www.enterprisedb.com