On 2024-12-08 15:44:23 +0700, Andrei Lepikhov wrote: > On 8/12/2024 09:52, Andres Freund wrote: > > > I think avoiding touching a hash table and an index under MergeJoin can > > > also > > > be beneficial. > > > > How would you get significant wins for mergejoins? You need to go through > > both > > inner and outer anyway?
> In my mind, this trick can be designed for specific cases like sales tables, > as illustrated before and used by well-rounded developers. I'm not saying that the optimization isn't useful, just that i don't think it makes much sense for mergeoins. Besides, at least as far as I can see, fundamentally not optimizing mergejoins in any substantial manner, it also just doesn't seem likely that queries where this optimization would come up are likely to be planned as mergejoins. If you have a leftjoin to a category-style table, you're very rarely going to scan the "main" table with an ordered index scan on the category key, which would be required for a merge join. And sorting the main table once for each to-be-joined-category-table isn't a desirable path most of the time either. I don't know what it means that it's to be "used by well-rounded developers". We have to write the implementation in way it works regardless of what kind of developer is using postgres. > I'm not sure that such optimisation would be profitable in general. Are you suggesting it would only be enabled by a GUC? > My point is that the sales database has lots of categories, and when > requesting product descriptions, we will not necessarily touch all the > categories - in that case, the one-sided clause could allow us to avoid > scanning some tables at all. Am I wrong? No, I don't think you're wrong. I just don't think it's worthwhile for mergejoins, because I don't see relevant query patterns where it would help. > BTW, may it be used in SEMI JOIN cases? Seems possible. Greetings, Andres Freund