Robert Haas <robertmh...@gmail.com> writes: > On Mon, Jan 15, 2018 at 12:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: >> Hm. Simple is certainly good, but if there's multiple rows coming >> back during an EPQ recheck then I think we have a performance problem.
> You are correct ... I was wrong about that part, and said so in an > email on this thread sent about 45 minutes before yours. However, I > still think the patch is a good fix for the immediate issue, unless > you see some problem with it. It's simple and back-patchable and does > not preclude further work anybody, including you, might want to do in > the future. I spent some more time poking at this. I confirmed that there is *not* a performance problem: as I'd thought, during an EPQ recheck each base scan relation will return its single EPQ tuple, without any communication with the remote server, whether or not the rel was marked FOR UPDATE. There *is* a problem with GetExistingLocalJoinPath not honoring its API spec: it will sometimes return join nests that include remote joins at levels below the top, as I'd speculated to begin with. This turns out not to be a problem for postgres_fdw, because any such remote-join node will just recursively fob off EPQ checks onto its own outerpath, so that eventually you get down to base scan relations anyway, and everything works. However, because GetExistingLocalJoinPath is claimed to be a general purpose function useful for other FDWs, the fact that its API contract is a lie seems to me to be a problem anyway. Another FDW might have a stronger dependency on there not being remote sub-joins. (Reviewing the thread history, I see that Ashutosh agreed this was possible at <CAFjFpRfYETZLiKERpxCeAN1MsiJypsyNd6x9FQ7OWjY=_tg...@mail.gmail.com> and suggested that we just change the function's commentary to not make promises it won't keep. Maybe that's enough, but I think just walking away isn't enough.) I'm also still pretty unhappy with the amount of useless planning work caused by doing GetExistingLocalJoinPath during path creation. It strikes me that we could likely replace the entire thing with some code that just reconstructs the join node's output tuple during EPQ using the rowmark data for all the base relations. Outer joins aren't really a problem: we could tell which relations were replaced by nulls because the rowmark values that bubbled up to the top went to nulls themselves. However, that's a nontrivial amount of work and probably wouldn't result in something we cared to back-patch, especially since it's not really a bug fix. Anyway, I agree with you that we know of no observable bug here except for Jeff's original complaint, and forcing sorting of the EPQ paths would be enough to fix that. Your patch seems OK codewise, but I think the comment is not nearly adequate. Maybe something like "The EPQ path must be at least as well sorted as the path itself, in case it gets used as input to a mergejoin." Also, a regression test case would be appropriate perhaps. I tried to reproduce Jeff's complaint in the postgres_fdw regression database, and it actually failed on the very first multi-way join I tried: contrib_regression=# explain select * from ft1,ft2,ft4,ft5 where ft1.c1=ft2.c1 and ft1.c1=ft4.c1 and ft1.c1=ft5.c1 for update; ERROR: outer pathkeys do not match mergeclauses Also, for the archives' sake, here's an example showing that GetExistingLocalJoinPath is not doing what it says on the tin: contrib_regression=# set from_collapse_limit TO 1; SET contrib_regression=# set enable_hashjoin TO 0; SET contrib_regression=# set enable_mergejoin TO 0; SET contrib_regression=# explain select ft1.c1,ft4.* from ft1,ft2 left join ft4 on ft2.c1=ft4.c1,ft5,loc1 where ft1.c1=ft2.c1 and ft1.c1=ft5.c1 and ft1.c1=loc1.f1 for update of ft1,loc1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- LockRows (cost=105.72..781.42 rows=210 width=286) -> Nested Loop (cost=105.72..779.32 rows=210 width=286) Join Filter: (ft2.c1 = loc1.f1) -> Seq Scan on loc1 (cost=0.00..22.70 rows=1270 width=10) -> Materialize (cost=105.72..128.06 rows=33 width=179) -> Foreign Scan (cost=105.72..127.89 rows=33 width=179) Relations: ((public.ft1) INNER JOIN ((public.ft2) LEFT JOIN (public.ft4))) INNER JOIN (public.ft5) -> Nested Loop (cost=233.62..721.53 rows=136 width=179) Join Filter: (ft2.c1 = ft5.c1) -> Nested Loop (cost=202.12..12631.43 rows=822 width=137) Join Filter: (ft1.c1 = ft2.c1) -> Foreign Scan on ft1 (cost=100.00..141.00 rows=1000 width=75) -> Materialize (cost=102.12..162.48 rows=822 width=83) -> Foreign Scan (cost=102.12..158.37 rows=822 width=83) Relations: (public.ft2) LEFT JOIN (public.ft4) -> Nested Loop Left Join (cost=200.00..856.78 rows=822 width=83) Join Filter: (ft2.c1 = ft4.c1) -> Foreign Scan on ft2 (cost=100.00..137.66 rows=822 width=54) -> Materialize (cost=100.00..102.75 rows=50 width=54) -> Foreign Scan on ft4 (cost=100.00..102.50 rows=50 width=54) -> Materialize (cost=100.00..102.16 rows=33 width=43) -> Foreign Scan on ft5 (cost=100.00..101.99 rows=33 width=43) (22 rows) I don't think this latter example needs to become a regression test in this form. If we wanted to gear up an isolation test that actually exercised EPQ with postgres_fdw, maybe it'd be useful in that context, to prove that nested remote joins work for EPQ. regards, tom lane