On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc...@gmail.com> wrote: > > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: > > > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote: > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra > > ><tomas.von...@2ndquadrant.com> wrote: > > >> > > >> ... > > >> > > >> More importanly, it does not actually fix the issue - it does fix that > > >> particular query, but just replacing the DISTINCT with either ORDER BY > > >> or GROUP BY make it fail again :-( > > >> > > >> Attached is a simple script I used, demonstrating these issues for the > > >> three cases that expect to have ressortgroupref != 0 (per the comment > > >> before TargetEntry in plannodes.h). > > > > > >So does checking for volatile expressions (if you happened to test > > >that) solve all the cases? If you haven't tested that yet, I can try > > >to do that this evening. > > > > > > > Yes, it does fix all the three queries in the SQL script. > > > > The question however is whether this is the root issue, and whether it's > > the right way to fix it. For example - volatility is not the only reason > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it > > also blocks pushdown of leaky functions in security_barrier views. So I > > wonder if that could cause failures too, somehow. But I haven't managed > > to create such example. > > I was about to say that the issue here is slightly different from qual > etc. pushdown, since we're not concerned about quals here, and so I > wonder where we determine what target list entries to put in a given > scan path, but then I realized that implies (maybe!) a simpler > solution. Instead of duplicating checks on target list entries would > be safe, why not check directly in get_useful_pathkeys_for_relation() > whether or not the pathkey has a target list entry? > > I haven't been able to try that out yet, and so maybe I'm missing > something, but I need to step out for a bit, so I'll have to look at > it later.
I've started poking at this, but haven't yet found a workable solution. See the attached patch which "solves" the problem by breaking putting the sort under the gather merge, but it at least demonstrates conceptually what I think we're interested in doing. The issue currently is that the comparison of expressions fails -- in our query, the first column selected shows up as a Var (with the same contents) but is a different pointer in the em_expr and the reltarget exprs list. I don't yet see a good way to get the equivalence class for a PathTarget entry. James
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c index b68a5a0ec7..454b3abe5c 100644 --- a/src/backend/optimizer/path/equivclass.c +++ b/src/backend/optimizer/path/equivclass.c @@ -777,8 +777,17 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel) foreach(lc_em, ec->ec_members) { EquivalenceMember *em = lfirst(lc_em); + PathTarget *target = rel->reltarget; + ListCell *lc_expr; + bool foundExpr = false; - if (bms_is_subset(em->em_relids, rel->relids) && + foreach(lc_expr, target->exprs) + { + if (lfirst(lc_expr) == em->em_expr) + foundExpr = true; + } + + if (foundExpr && bms_is_subset(em->em_relids, rel->relids) && !bms_is_empty(em->em_relids)) { /*