Re: enable_incremental_sort changes query behavior

James Coleman Sat, 03 Oct 2020 07:51:03 -0700

On Fri, Oct 2, 2020 at 11:16 PM James Coleman <jtc...@gmail.com> wrote:
>
> On Fri, Oct 2, 2020 at 7:07 PM James Coleman <jtc...@gmail.com> wrote:
> >
> > On Fri, Oct 2, 2020 at 6:28 PM Tomas Vondra
> > <tomas.von...@2ndquadrant.com> wrote:
> > >
> > > On Fri, Oct 02, 2020 at 05:45:52PM -0400, James Coleman wrote:
> > > >On Fri, Oct 2, 2020 at 4:56 PM Tomas Vondra
> > > ><tomas.von...@2ndquadrant.com> wrote:
> > > >>
> > > >> ...
> > > >>
> > > >> More importanly, it does not actually fix the issue - it does fix that
> > > >> particular query, but just replacing the DISTINCT with either ORDER BY
> > > >> or GROUP BY make it fail again :-(
> > > >>
> > > >> Attached is a simple script I used, demonstrating these issues for the
> > > >> three cases that expect to have ressortgroupref != 0 (per the comment
> > > >> before TargetEntry in plannodes.h).
> > > >
> > > >So does checking for volatile expressions (if you happened to test
> > > >that) solve all the cases? If you haven't tested that yet, I can try
> > > >to do that this evening.
> > > >
> > >
> > > Yes, it does fix all the three queries in the SQL script.
> > >
> > > The question however is whether this is the root issue, and whether it's
> > > the right way to fix it. For example - volatility is not the only reason
> > > that may block qual pushdown. If you look at qual_is_pushdown_safe, it
> > > also blocks pushdown of leaky functions in security_barrier views. So I
> > > wonder if that could cause failures too, somehow. But I haven't managed
> > > to create such example.
> >
> > I was about to say that the issue here is slightly different from qual
> > etc. pushdown, since we're not concerned about quals here, and so I
> > wonder where we determine what target list entries to put in a given
> > scan path, but then I realized that implies (maybe!) a simpler
> > solution. Instead of duplicating checks on target list entries would
> > be safe, why not check directly in get_useful_pathkeys_for_relation()
> > whether or not the pathkey has a target list entry?
> >
> > I haven't been able to try that out yet, and so maybe I'm missing
> > something, but I need to step out for a bit, so I'll have to look at
> > it later.
>
> I've started poking at this, but haven't yet found a workable
> solution. See the attached patch which "solves" the problem by
> breaking putting the sort under the gather merge, but it at least
> demonstrates conceptually what I think we're interested in doing.
>
> The issue currently is that the comparison of expressions fails -- in
> our query, the first column selected shows up as a Var (with the same
> contents) but is a different pointer in the em_expr and the reltarget
> exprs list. I don't yet see a good way to get the equivalence class
> for a PathTarget entry.


Hmm, I think I was looking to do is the attached patch. I didn't
realize until I did a lot of reading through source that we have an
equal() function that can compare exprs. That (plus the realization in
[1] the originally reported backtrace wasn't where the error actually
came from) convinced me that what we need is to confirm not only that
the all of the vars in the ec member come from the relids in the rel
but also that the expr is actually represented in the target of the
rel.

With the gucs changed as I mentioned earlier both of the plans (with
and without a volatile call in the 2nd select entry) now look like:

 Unique
   ->  Gather Merge
         Workers Planned: 2
         ->  Sort
               Sort Key: ref_0.i, (md5(ref_0.t))
               ->  Nested Loop
                     ->  Parallel Index Scan using ref_0_i_idx on ref_0
                     ->  Function Scan on generate_series ref_1

Without the gucs changed the minimal repro case now doesn't error, but
results in this plan:

 HashAggregate
   Group Key: ref_0.i, CASE WHEN pg_rotate_logfile_old() THEN ref_0.t
ELSE ref_0.t END
   ->  Nested Loop
         ->  Seq Scan on ref_0
         ->  Function Scan on generate_series ref_1

Similarly in your six queries I now only see parallel query showing up
in the last one.

I created an entirely new function because adding the target expr
lookup to the existing find_em_expr_for_rel() function broke a bunch
of postgres_fdw tests. That maybe raises questions about whether that
code also could have problems in theory/in the future, but I didn't
investigate further. In any case we already know it excludes
volatile...so maybe it's fine because in practice that's actually a
broader exclusion than what we're doing here.

This seems to fix the issue, but I'd like feedback on whether it's too
strict. We could of course just check em_has_volatile, but I'm
wondering if that's simultaneously too strict (by not allowing the
volatile expression to be computed in the gather merge supposing
there's no join) and too loose (maybe there are other cases we should
care about?). It also just strikes me as re-encoding rules that should
have already been applied (and thus we should be able to look up in
the data we have if it's safe to use the expr/pathkey). Those are
mostly intuitions though.

James

1: 
https://www.postgresql.org/message-id/CAAaqYe8zqDAv0Sfak5Riu%2BDKsm-i3ARPursn5v6qTwiCXmkXKQ%40mail.gmail.com

From 1b2d39d2119ab6bd09f6b984b4d3a7f46be090e9 Mon Sep 17 00:00:00 2001
From: James Coleman <jtc...@gmail.com>
Date: Sat, 3 Oct 2020 10:35:47 -0400
Subject: [PATCH v1] Fix get_useful_pathkeys_for_relation for volatile exprs
 below joins

It's not sufficient to find an ec member whose Vars are all in the rel
we're working with; volatile expressions in particular present a case
where we also need to know if the rel's target contains the expression
or not before we can assume the pathkey for that ec is safe to  use at
this point in the query. If the pathkey expr is volatile and we're
below a join, then the expr can't appear in the target until we're above
the join, so we can't sort with it yet.
---
 src/backend/optimizer/path/allpaths.c   | 11 +++----
 src/backend/optimizer/path/equivclass.c | 38 +++++++++++++++++++++++++
 src/include/optimizer/paths.h           |  1 +
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index b399592ff8..53050b825b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2760,7 +2760,8 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 	/*
 	 * Considering query_pathkeys is always worth it, because it might allow
 	 * us to avoid a total sort when we have a partially presorted path
-	 * available.
+	 * available or to push the total sort into the parallel portion of the
+	 * query.
 	 */
 	if (root->query_pathkeys)
 	{
@@ -2773,17 +2774,17 @@ get_useful_pathkeys_for_relation(PlannerInfo *root, RelOptInfo *rel)
 			EquivalenceClass *pathkey_ec = pathkey->pk_eclass;
 
 			/*
-			 * We can only build an Incremental Sort for pathkeys which
-			 * contain an EC member in the current relation, so ignore any
+			 * We can only build a sort for pathkeys which
+			 * contain an EC member in the current relation's target, so ignore any
 			 * suffix of the list as soon as we find a pathkey without an EC
-			 * member the relation.
+			 * member in the relation.
 			 *
 			 * By still returning the prefix of the pathkeys list that does
 			 * meet criteria of EC membership in the current relation, we
 			 * enable not just an incremental sort on the entirety of
 			 * query_pathkeys but also incremental sort below a JOIN.
 			 */
-			if (!find_em_expr_for_rel(pathkey_ec, rel))
+			if (!find_em_expr_for_rel_target(pathkey_ec, rel))
 				break;
 
 			npathkeys++;
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b68a5a0ec7..6cc358e60d 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -794,6 +794,44 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Find an equivalence class member expression that is in the
+ * the indicated relation's target.
+ */
+Expr *
+find_em_expr_for_rel_target(EquivalenceClass *ec, RelOptInfo *rel)
+{
+	ListCell   *lc_em;
+
+	foreach(lc_em, ec->ec_members)
+	{
+		EquivalenceMember *em = lfirst(lc_em);
+		PathTarget *target = rel->reltarget;
+		ListCell *lc_expr;
+		bool has_target_expr = false;
+
+		foreach(lc_expr, target->exprs)
+		{
+			if (equal(lfirst(lc_expr), em->em_expr))
+				has_target_expr = true;
+		}
+
+		if (has_target_expr && bms_is_subset(em->em_relids, rel->relids) &&
+			!bms_is_empty(em->em_relids))
+		{
+			/*
+			 * If there is more than one equivalence member whose Vars are
+			 * taken entirely from this relation, we'll be content to choose
+			 * any one of those.
+			 */
+			return em->em_expr;
+		}
+	}
+
+	/* We didn't find any suitable equivalence class expression */
+	return NULL;
+}
+
 /*
  * generate_base_implied_equalities
  *	  Generate any restriction clauses that we can deduce from equivalence
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 10b6e81079..6ddb54f415 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -135,6 +135,7 @@ extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 												  Relids rel,
 												  bool create_it);
 extern Expr *find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel);
+extern Expr *find_em_expr_for_rel_target(EquivalenceClass *ec, RelOptInfo *rel);
 extern void generate_base_implied_equalities(PlannerInfo *root);
 extern List *generate_join_implied_equalities(PlannerInfo *root,
 											  Relids join_relids,
-- 
2.17.1

Re: enable_incremental_sort changes query behavior

Reply via email to