David Rowley <david.row...@2ndquadrant.com> writes:
> [ unique_joins_2016-04-07.patch ]
Just had a thought about this, which should have crystallized a long time ago perhaps. Where I'd originally imagined you were going with this idea is to do what the thread title actually says, and check for joins in which the *outer* side is unique. I can't see that that's of any value for nestloop or hash joins, but for merge joins, knowing that the outer side is unique could be extremely valuable because we could skip doing mark/restore backups on the inner side, hugely reducing the cost when the inner side has many duplicates. Now I'm not asking for the initially-committed version of the patch to do that, but I think we need to design it to be readily extensible to do so.

The problem with this is that it blows the current factorization around add_paths_to_joinrel out of the water. What we'd want is for the caller (make_join_rel) to determine uniqueness on both sides, and pass that info down to each of its two calls of add_paths_to_joinrel; otherwise we have to do double the work because each run of add_paths_to_joinrel will have to make those same two determinations.

This probably also means that encoding the uniqueness into JoinType is a lost cause. Expanding JOIN_INNER into four variants depending on whether either or both sides are known unique, and ditto for JOIN_LEFT, doesn't seem attractive at all. I suspect we want to go back to your original design with a separate bool flag (well, two bools now, but anyway separate from JoinType). Or maybe the variant JoinTypes still are better, since they'd fit into switch() tests more naturally, but it's a lot more debatable as to whether that notation is a win. I apologize for leading you down the wrong path on the notational aspect, but sometimes the right answer isn't clear till you've tried all the possibilities.
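
As a rough illustration of the merge join point above (this is not PostgreSQL code; toy_merge_join, MergeStats, and the outer_unique flag are made-up names for this sketch), the following C function joins two sorted int arrays and counts how often it has to mark and restore the inner position. Assuming the caller's uniqueness claim is correct, the flag lets it skip that bookkeeping without changing the join result.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters for this sketch only. */
typedef struct MergeStats {
    size_t matches;   /* joined (outer, inner) pairs produced */
    size_t marks;     /* times the inner position was remembered */
    size_t restores;  /* times the inner scan was rewound to a mark */
} MergeStats;

/*
 * Toy merge join over two int arrays sorted ascending.
 * outer_unique is an assumption supplied by the caller: when true, no
 * two outer keys are equal, so the inner side never has to be rewound.
 */
MergeStats toy_merge_join(const int *outer, size_t n_outer,
                          const int *inner, size_t n_inner,
                          bool outer_unique)
{
    MergeStats st = {0, 0, 0};
    size_t o = 0, i = 0;

    while (o < n_outer && i < n_inner) {
        if (outer[o] < inner[i]) { o++; continue; }
        if (outer[o] > inner[i]) { i++; continue; }

        /* Keys are equal: this is the start of a run of inner duplicates. */
        size_t mark = i;
        if (!outer_unique)
            st.marks++;

        /* Emit one match per inner row in the run. */
        while (i < n_inner && inner[i] == outer[o]) {
            st.matches++;
            i++;
        }

        /* Advance the outer side; rewind the inner side only if the
         * next outer key could match the same inner run again. */
        o++;
        if (!outer_unique && o < n_outer && outer[o] == inner[mark]) {
            i = mark;
            st.restores++;
        }
    }
    return st;
}
```

With outer keys {1, 2, 2, 3} and inner keys {2, 2, 2, 3}, the duplicate outer 2 forces one restore, so the run of inner 2s is scanned twice. With a unique outer side {1, 2, 3} and the flag set, the same inner input is scanned once and no mark is ever taken.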
Anyway, while refactoring the make_join_rel/add_paths_to_joinrel division of labor wouldn't be such a big deal in itself, I don't want to commit a change to JoinType only to undo it later; that would be too much churn. So I think we need to resolve this question before we can move forward. I don't know if you have time to look at this now --- my clock says it's already Friday morning in New Zealand.

regards, tom lane