On Tue, Sep 3, 2024 at 11:31 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > Yeah. I've been mulling over how we could do this, and the real > problem is that the expression containing the PHV *has* been fully > preprocessed by the time we get to outer join strength reduction > (cf. file header comment in prepjointree.c). Simply dropping the PHV > can break various invariants that expression preprocessing is supposed > to establish, such as "no RelabelType directly above another" or "no > AND clause directly above another".
Yeah, this is what the problem is. > I haven't thought of a reliable > way to fix that short of re-running eval_const_expressions afterwards, > which seems like a mighty expensive answer. This seems likely to result in a lot of duplicate work. > We could try to make > remove_nulling_relids_mutator preserve all these invariants, but > keeping it in sync with what eval_const_expressions does seems like > a maintenance nightmare. Yeah, and it seems that we also need to know the EXPRKIND code for the expression containing the to-be-dropped PHV in remove_nulling_relids to know how we should preserve all these invariants, which seems to require a lot of code changes. Looking at the routines run in preprocess_expression, most of them recurse into PlaceHolderVars and preprocess the contained expressions. The two exceptions are canonicalize_qual and make_ands_implicit. I wonder if we can modify them to also recurse into PlaceHolderVars. Will this resolve our problem here? > I believe this is an area worth working on. I've been wondering > whether it'd be better to handle the expression-identity problems by > introducing an "expression wrapper" node type that is distinct from > PHV and has the sole purpose of demarcating a subexpression we don't > want to be folded into its parent. I think that doesn't really move > the football in terms of fixing the problems mentioned above, but > perhaps it's conceptually cleaner than adding a bool field to PHV. > > Another thought is that grouping sets provide one of the big reasons > why we need an "expression wrapper" or equivalent functionality. > So maybe we should try to move forward on your other patch to change > the representation of those before we spend too much effort here. Hmm, in the case of grouping sets, we have a similar situation where we do not want a subexpression that is part of grouping items to be folded into its parent, because otherwise we may fail to match these subexpressions to lower target items. For grouping sets, this problem is much easier to resolve, because we've already replaced such subexpressions with Vars referencing the GROUP RTE. As a result, during expression preprocessing, these subexpressions will not be folded into their parents. An ensuing effect of this approach is that a HAVING clause may contain expressions that are not fully preprocessed if they are part of grouping items. This is not an issue as long as the clause remains in HAVING, because these expressions will be matched to lower target items in setrefs.c. However, if the clause is moved or copied into WHERE, we need to re-preprocess these expressions. This should not be too expensive, as we only need to re-preprocess the HAVING clauses that are moved to WHERE, not the entire tree. The v13 patch in that thread implements this approach. Thanks Richard