On Mon, 27 Sept 2021 at 06:52, Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Thu, Sep 23, 2021 at 11:11 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > > > On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro <b.sa...@gmail.com> > > wrote: > > > > > > And to get the multi-column hash index selected, we may set > > > enable_hashjoin =off, to avoid any condition become join condition, > > > saw similar behaviors in other DBs as well... > > > > This may be related to Tom's point that, if some of the quals are > > removed due to optimization or converted to join quals, then now, even > > if the user has given qual on all the key columns the index scan will > > not be selected because we will be forcing that the hash index can > > only be selected if it has quals on all the key attributes? > > > > I don't think suggesting enable_hashjoin =off is a solution, > > > > Yeah, this doesn't sound like a good idea. How about instead try to > explore the idea where the hash (bucket assignment and search) will be > based on the first index key and the other columns will be stored as > payload? I think this might pose some difficulty in the consecutive > patch to enable a unique index because it will increase the chance of > traversing more buckets for uniqueness checks. If we see such > problems, then I have another idea to minimize the number of buckets > that we need to lock during uniqueness check which is by lock chaining > as is used during hash bucket clean up where at a time we don't need > to lock more than two buckets at a time.
I have presented a simple, almost trivial, patch to allow multi-col hash indexes. It hashes the first column only, which can be a downside in *some* cases. If that is clearly documented, it would not cause many issues, IMHO. However, it does not have any optimization issues or complexities, which is surely a very good thing. Trying to involve *all* columns in the hash index is a secondary optimization. It requires subtle changes in optimizer code, as Tom points out. It also needs fine tuning to make the all-column approach beneficial for the additional cases without losing against what the "first column" approach gives. I did consider both approaches and after this discussion I am still in favour of committing the very simple "first column" approach to multi-col hash indexes now. -- Simon Riggs http://www.EnterpriseDB.com/