On Sat, Sep 24, 2016 at 1:03 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sat, Sep 24, 2016 at 1:02 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> Currently, hash indexes always store the hash code in the index, but
>> not the actual Datum. It's recently been noted that this can make a
>> hash index smaller than the corresponding btree index would be if the
>> column is wide. However, if the index is being built on a fixed-width
>> column with a typlen <= sizeof(Datum), we could store the original
>> value in the hash index rather than the hash code without using any
>> more space. That would complicate the code, but I bet it would be
>> faster: we wouldn't need to set xs_recheck, we could rule out hash
>> collisions without visiting the heap, and we could support index-only
>> scans in such cases.
>
> What exactly you mean by Datum? Is it for datatypes that fits into 64
> bits like integer.

Yeah, I mean whatever is small enough to fit into the space currently
being used to store the hashcode, along with any accompanying padding
bytes that we can also use.

> I think if we are able to support index only scans
> for hash indexes for some data types, that will be a huge plus.
> Surely, there is some benefit without index only scans as well, which
> is we can avoid recheck, but not sure if that alone can give us any
> big performance boost. As, you say, it might lead to some
> complication in code, but I think it is worth trying.

Yeah, the recheck is probably not that expensive if we have to
retrieve the heap page anyway.

> Won't it add some requirements for pg_upgrade as well?

I have nothing to add to what Bruce already said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
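
For concreteness, the eligibility condition discussed above (a fixed-width
column with typlen <= sizeof(Datum)) could be sketched roughly as follows.
This is only an illustration, not code from any patch: the Datum typedef is
a stand-in so the snippet compiles on its own, and hash_can_store_value()
and its self-check are hypothetical names. It also ignores the question of
how many padding bytes are actually available next to the hash code, which
would need to be settled against the real index tuple layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum (assumption, so the sketch is standalone). */
typedef uintptr_t Datum;

/*
 * Hypothetical helper: could the original column value be stored in the
 * hash index in place of the hash code?
 *
 * typlen follows the pg_type convention: positive for fixed-width types
 * (e.g. 4 for int4), negative for variable-length types (e.g. -1 for text).
 */
bool
hash_can_store_value(int typlen)
{
    /* Variable-length (or invalid) types keep storing the hash code. */
    if (typlen <= 0)
        return false;

    /* Fixed-width: eligible only if the value fits in a Datum-sized slot. */
    return (size_t) typlen <= sizeof(Datum);
}

/* Small self-check exercising the cases mentioned in the thread. */
void
hash_can_store_value_selfcheck(void)
{
    assert(hash_can_store_value(4));                        /* int4-like */
    assert(hash_can_store_value((int) sizeof(Datum)));      /* exactly fits */
    assert(!hash_can_store_value((int) sizeof(Datum) + 1)); /* too wide */
    assert(!hash_can_store_value(-1));                      /* varlena */
}
```

With a check like this, only the eligible columns would take the new path
(no xs_recheck, possible index-only scans); everything else would continue
to store the hash code as today.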