Luc Vlaming <l...@swarm64.com> writes:
> Given the testcase we see that the outer semi join tries to join the 
> outer with the inner table id columns, even though the middle table id 
> column is also there. Is this expected behavior?

I don't see anything greatly wrong with it.  The planner has concluded
that _inner.id2 and middle.id1 are part of an equivalence class, so it
can form the top-level join by equating _outer.id3 to either of them.
AFAIR that choice is made at random --- there's certainly not any logic
that thinks about "well, the intermediate join output could be a bit
narrower if we choose this one instead of that one".

I think "made at random" actually boils down to "take the first usable
member of the equivalence class".  If I switch around the wording of
the first equality condition:

   ... select 1 from _inner where middle.id1 = _inner.id2

then I get a plan where the top join uses middle.id1.  However,
it's still propagating both middle.id1 and _inner.id2 up through
the bottom join, so that isn't buying anything efficiency-wise.

                        regards, tom lane


Reply via email to