Hi Richard Thank you work on this - * - * XXX this could be enabled if the remaining join quals were made part of - * the inner scan's filter instead of the join filter. Maybe it's worth - * considering doing that? */ - if (extra->inner_unique && - (inner_path->param_info == NULL || - bms_num_members(inner_path->param_info->ppi_serials) < - list_length(extra->restrictlist))) - return NULL; + if (extra->inner_unique) + { + Bitmapset *ppi_serials; + + if (inner_path->param_info == NULL) + return NULL; + + ppi_serials = inner_path->param_info->ppi_serials; + + foreach_node(RestrictInfo, rinfo, extra->restrictlist) + { + if (!bms_is_member(rinfo->rinfo_serial, ppi_serials)) + return NULL; + } + } The refined version introduces a more precise validation by: 1. Explicitly verifying each restriction – Instead of relying on a count comparison, it checks whether every restriction's serial number is included in the inner path’s parameter info (ppi_serials). 2. Reducing false negatives – The optimizer now only discards a path if a restriction is definitively unhandled, allowing more valid plans to be considered.
On Mon, Apr 7, 2025 at 3:50 PM Richard Guo <guofengli...@gmail.com> wrote: > While reviewing another patch [1], I noticed an incorrect check in > get_memoize_path. > > if (extra->inner_unique && > (inner_path->param_info == NULL || > bms_num_members(inner_path->param_info->ppi_serials) < > list_length(extra->restrictlist))) > return NULL; > > This check is to ensure that, for unique joins, all the restriction > clauses are parameterized to enable the use of Memoize nodes. > However, ppi_clauses includes join clauses available from all outer > rels, not just the current outer rel, while extra->restrictlist only > includes the restriction clauses for the current join. This means the > check could pass even if a restriction clause isn't parameterized, as > long as another join clause, which doesn't belong to the current join, > is included in ppi_clauses. It's not very hard to construct a query > that exposes this issue. > > create table t (a int, b int); > > explain (costs off) > select * from t t0 left join > (t t1 left join t t2 on t1.a = t2.a > join lateral > (select distinct on (a, b) a, b, t0.a+t1.a+t2.a as x from t t3 offset > 0) t3 > on t1.a = t3.a and coalesce(t2.b) = t3.b) > on t0.b = t3.b; > QUERY PLAN > --------------------------------------------------------- > Nested Loop Left Join > -> Seq Scan on t t0 > -> Nested Loop > Join Filter: (COALESCE(t2.b) = t3.b) > -> Hash Left Join > Hash Cond: (t1.a = t2.a) > -> Seq Scan on t t1 > -> Hash > -> Seq Scan on t t2 > -> Subquery Scan on t3 > Filter: ((t0.b = t3.b) AND (t1.a = t3.a)) > -> Unique > -> Sort > Sort Key: t3_1.a, t3_1.b > -> Seq Scan on t t3_1 > (15 rows) > > Consider the join to t3. It is a unique join, and not all of its > restriction clauses are parameterized. Despite this, the check still > passes. > > Interestingly, although the check passes, no Memoize nodes are added > on top of t3's paths because paraminfo_get_equal_hashops insists that > all ppi_clauses must satisfy clause_sides_match_join (though I > actually question whether this is necessary, but that's a separate > issue). However, this does not justify the check being correct. > > Hence, I propose the attached patch for the fix. > > BTW, there is a XXX comment there saying that maybe we can make the > remaining join quals part of the inner scan's filter instead of the > join filter. I don't think this is possible in all cases. In the > above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan > filter, according to join_clause_is_movable_into. So I removed that > comment in the patch while we're here. > > Any thoughts? > > [1] https://postgr.es/m/60bf8d26-7c7e-4915-b544-afdb90200...@gmail.com > > Thanks > Richard >