On Thu, Aug 3, 2017 at 6:47 PM, Robert Haas <robertmh...@gmail.com> wrote:
> That seems pretty lame, although it's sufficient to solve the
> immediate problem, and I have to admit to a certain predilection for
> things that solve the immediate problem without creating lots of
> additional work.

After some further thought, I propose the following approach to the issues raised on this thread:

1. Allow hash opclasses to have a second, optional support function, similar to what we did for btree opclasses in c6e3ac11b60ac4a8942ab964252d51c1c0bd8845. The second function will have a signature of (opclass_datatype, int64) and should return int64. The int64 argument is a salt. When the salt is 0, the low 32 bits of the return value should match what the existing hash support function returns. Otherwise, the salt should be used to perturb the hash calculation. This design kills two birds with one stone: it gives callers a way to get 64-bit hash values if they want them (which should make Tom happy, and we could later think about plugging it into hash indexes), and it gives us a way of turning a single hash function into many (which should allow us to prevent hash indexes or hash tables built on a hash-partitioned table from having a heavily lopsided distribution, and probably will also make people who are interested in topics like Bloom filters happy).

2. Introduce new hash opfamilies which are faster, more portable, and/or better in other ways than the ones we have today. Given our current rather simplistic notion of a "default" opclass, there doesn't seem to be an easy way to make whatever we introduce here the default for hash partitioning while keeping the existing default for other purposes. That should probably be fixed at some point. However, given the amount of debate this topic has generated, it also doesn't seem likely that we'd actually wish to decide on a different default in the v11 release cycle, so I don't think there's any rush to figure out exactly how we want to fix it. Focusing on introducing the new opfamilies at all is probably a better use of time, IMHO.

Unless anybody strongly objects, I'm going to write a patch for #1 (or convince somebody else to do it) and leave #2 for someone else to tackle if they wish.
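
To make the contract in #1 concrete, here is a minimal sketch of how a salted 64-bit function could sit next to an existing 32-bit one for a 4-byte integer type. Everything in it is an assumption for illustration: the names hash_int32_legacy and hash_int32_extended are hypothetical, the MurmurHash3 finalizers stand in for whatever the real opclass uses, and unsigned 64-bit types are used in place of int64 so that the arithmetic is well defined in plain C. It is not the proposed patch.

```c
#include <stdint.h>

/* MurmurHash3 32-bit finalizer, used here only as a stand-in mixer. */
static uint32_t
mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* MurmurHash3 64-bit finalizer, used to fold the salt into the result. */
static uint64_t
mix64(uint64_t k)
{
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    k *= 0xc4ceb9fe1a85ec53ULL;
    k ^= k >> 33;
    return k;
}

/* Hypothetical stand-in for the existing 32-bit hash support function. */
uint32_t
hash_int32_legacy(int32_t value)
{
    return mix32((uint32_t) value);
}

/*
 * Hypothetical second support function: (value, salt) -> 64-bit hash.
 * With salt == 0 the low 32 bits equal hash_int32_legacy(value);
 * any other salt perturbs all 64 bits.
 */
uint64_t
hash_int32_extended(int32_t value, uint64_t salt)
{
    uint32_t lo = hash_int32_legacy(value);
    uint32_t hi = mix32(lo ^ 0x9e3779b9U);  /* derive the upper half */
    uint64_t h = ((uint64_t) hi << 32) | lo;

    if (salt != 0)
        h = mix64(h ^ salt);
    return h;
}
```

A caller that needs several independent hash functions, say one for choosing a partition and another for a hash table built within that partition, could pass a different non-zero salt for each purpose, while a caller that passes 0 stays compatible with the existing 32-bit values.
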
In addition, I'll tackle (or convince someone else to tackle) the project of adding that second optional support function to every hash opclass in the core repository. Then Amul can update the core hash partitioning patch to use the new infrastructure when it's available and fall back to the existing method when it's not, and I think we'll be in reasonably good shape.

Objections to this plan of attack?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company