Re: [HACKERS] Multicolumn hash indexes

Robert Haas Wed, 27 Sep 2017 09:00:08 -0700

On Wed, Sep 27, 2017 at 11:57 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> On Wed, Sep 27, 2017 at 9:56 AM, Jesper Pedersen
>> <jesper.peder...@redhat.com> wrote:
>>> Maybe an initial proof-of-concept could store the hash of the first column
>>> (col1) plus the hash of all columns (col1, col2, col3) in the index, and see
>>> what requirements / design decisions would appear from that.
>
>> I thought about that sort of thing yesterday but it's not that simple.
>> The problem is that the hash code isn't just stored; it's used to
>> assign tuples to buckets.  If you have two hash codes, you have to
>> pick one of the other to use for assigning the tuple to a bucket.  And
>> then if you want to search using the other hash code, you have to
>> search all of the buckets, which will stink.
>
> If we follow GIST's lead that the leading column is "most important",
> the idea could be to require a search constraint on the first column,
> which produces the hash that determines the bucket assignment.  Hashes
> for additional columns would just be payload data in the index entry.
> If you have search constraint(s) on low-order column(s), you can check
> for hash matches before visiting the heap, but they don't reduce how
> much of the index you have to search.  Even btree works that way for
> many combinations of incomplete index constraints.


I see.  That makes sense.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Multicolumn hash indexes

Reply via email to