Luc Vlaming <l...@swarm64.com> writes: > Given the testcase we see that the outer semi join tries to join the > outer with the inner table id columns, even though the middle table id > column is also there. Is this expected behavior?
I don't see anything greatly wrong with it. The planner has concluded that _inner.id2 and middle.id1 are part of an equivalence class, so it can form the top-level join by equating _outer.id3 to either of them. AFAIR that choice is made at random --- there's certainly not any logic that thinks about "well, the intermediate join output could be a bit narrower if we choose this one instead of that one". I think "made at random" actually boils down to "take the first usable member of the equivalence class". If I switch around the wording of the first equality condition: ... select 1 from _inner where middle.id1 = _inner.id2 then I get a plan where the top join uses middle.id1. However, it's still propagating both middle.id1 and _inner.id2 up through the bottom join, so that isn't buying anything efficiency-wise. regards, tom lane