Robert Haas <robertmh...@gmail.com> writes: > But have a look at this: > http://postgr.es/m/561e12d4.7040...@lab.ntt.co.jp > That shows a case where, before > 5fc4c26db5120bd90348b6ee3101fcddfdf54800, a query that required the > foreign table to do an EPQ recheck produced an unambiguously wrong > answer; the query stipulates that inttab.a = ft1.a but the row > returned lacks that property. I think this clearly shows that EPQ > rechecks are needed at least for FDW paths that are parameterized.
Certainly; that's what I said before. > However, I guess that doesn't actually refute your point, which IIUC > is that maybe we don't need these checks for FDW join paths, because > we don't build parameterized paths for those yet. Now I think it would > still be a good idea to have a way of handling that case, because > somebody's likely going want to fix that /* XXX Consider parameterized > paths for the join relation */ eventually, but I guess for the > back-branches just setting epq_path = NULL instead of calling > GetExistingLocalJoinPath() might be the way to go... assuming that > your contention that this won't break anything is indeed correct. I thought some more about this. While it seems clear that we don't actually need to recheck anything within a postgres_fdw foreign join, there's still a problem: we have to be able to regurgitate the join row during postgresRecheckForeignScan. Since, in the current system design, the only available data is scan-level rowmarks (that is, whole-row values from each base foreign relation), there isn't any very good way to produce the join row except to insert the rowmark values into dummy scan nodes and then re-execute the join locally. So really, that is what the outerPlan infrastructure is doing for us. We could imagine that, instead of the scan-level rowmarks, the foreign join node could produce a ROW() expression containing the values it actually has to output, and then use that as a "join-level rowmark" to reconstitute the join row without any of the outerPlan infrastructure. This would have some pretty significant advantages, notably that we could (in principle) avoid fetching all the columns of remote tables we no longer needed scan-level rowmarks for. However, to do that, we'd have to dynamically decide which rowmark expressions actually need to wind up getting computed in the final join plan, based on which joins we choose to do remotely. The current scheme of statically adding rowmarks to the query during base relation setup doesn't cut it. So that's sounding like a nontrivial rewrite (and one that would break the related FDW APIs, although the patch at hand does that too). As for the question of a future extension to parameterized paths, I think that it would likely work to recheck the parameterized qual at the join node, making it basically just like rechecks for scan-level nodes. The objection to this is that it might not produce the same answer if the parameterized qual references relations that are on the insides of outer joins ... but I think that in practice we could just blow off that case (ie, reject producing parameterized paths involving such quals). The reason is that I think such cases arise only very rarely. It could only happen for non-strict qual expressions, because a strict qual would have resulted in the outer join getting strength reduced to a regular join. Not sure where that leaves us for the patch at hand. It feels to me like it's a temporary band-aid for code that we want to get rid of in the not-too-long run. As such, it'd be okay if it were smaller, but it seems bigger and more invasive than I could wish for a band-aid. regards, tom lane