On Fri, Sep 29, 2017 at 10:54 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > There are few if any indexing techniques where the first column isn't > significantly more important than the rest --- certainly that's true > for btree, for example.
Well, BRIN doesn't care. > I do not think it's a showstopper if that's > true for hash as well. Certainly true, but on the other hand, if the design for which you are advocating doesn't address a problem Tomasz has, he probably won't want to bother implementing it. I think it's pretty interesting to think about what other index designs we might have that using hashing but are not the same as our existing hash indexes. Reviewing Amit's patches for hash indexes has helped me to understand how the hash AM works a lot better than I had before, and it's really a pretty limiting design. The buckets aren't sorted by hashcode overall, just within a page, which sucks, and the way that storage allocations are managed within the index is painful. It might be worth thinking about what we might create for a hash index today if we were starting over from scratch. Andres proposed a hash-over-btree thing where you build a hash value for each tuple and then sort them by hash code and arrange the results into a btree; I think that would have the same multi-column index problems you're talking about here, but it would be more space-efficient. Now you could also imagine something where we keep a separate set of hash buckets for each column and multi-column searches are handled by searching each hash table separately and taking the intersection of the sets of TIDs found for each column. Not sure that's a good idea, but it's sort of like what GIN does. There are probably other ideas, too. We should investigate what others have done. I see that https://docs.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/hash-indexes-for-memory-optimized-tables section E.1 claims that SQL Server requires a hash index to include an equality condition on every column in the index. I still think that's an appealing option from a theoretical point of view even if the implementation difficulties are formidable. Come to think of it, I think that the planner behavior you describe whereby we put a premium on throwing away unnecessary conditions is probably not so great in other cases, too. In a btree index, we have to step through all the tuples in the range covered by the index scan one by one; if the join sitting above the index scan will reject those tuples anyway when it tests the join qual, I can see that doing the same test redundantly at the scan level could well be a loser. But suppose it's a BRIN index. Now doing the test at the scan level is a lot more likely to win -- I think -- because BRIN can perform that test once per page rather than once per tuple, whereas the join will have to reject individual tuples. This sounds very similar to the question of whether we ought to infer implied inequalities, which as you know has been a regular complaint. It's certainly the case that enforcing an inequality in multiple places can be inefficient, but it can also be a tremendous win if the inequality is highly selective, which is not uncommon (e.g. imagine a sort-and-merge-join between two tables and an inequality on the join column that knocks out 90% - or 99% - of the rows in each). I have been wondering whether we need a concept of an optional qual. Instead of dropping a qual that doesn't need to be enforced, move it to a separate bucket of quals that can be enforced if it looks like a win from a cost perspective and otherwise ignored without breaking anything. Implied inequalities (and maybe implied equalities) could be thrown into such a bucket as well, allowing the decision about whether to enforce them to be based on some algorithm or heuristic rather than a summary judgement. Or maybe that's not the right concept at all, but I think we need some idea about how cases like this can be handled. People want the database to handle the cases about which they care, not judge which cases those are. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers