On 4/15/21 7:35 PM, James Coleman wrote:
> On Thu, Apr 15, 2021 at 5:33 AM Luc Vlaming <l...@swarm64.com> wrote:
>>
>> On 15-04-2021 04:01, James Coleman wrote:
>>> On Wed, Apr 14, 2021 at 5:42 PM James Coleman <jtc...@gmail.com> wrote:
>>>>
>>>> On Mon, Apr 12, 2021 at 8:37 AM Tomas Vondra
>>>> <tomas.von...@enterprisedb.com> wrote:
>>>>>
>>>>> On 4/12/21 2:24 PM, Luc Vlaming wrote:
>>>>>> Hi,
>>>>>>
>>>>>> When trying to run on master (but afaik also PG-13) TPC-DS queries 94,
>>>>>> 95 and 96 on a SF10 I get the error "could not find pathkey item to 
>>>>>> sort".
>>>>>> When I disable enable_gathermerge the problem goes away and then the
>>>>>> plan for query 94 looks like below. I tried figuring out what the
>>>>>> problem is but to be honest I would need some pointers as the code that
>>>>>> tries to matching equivalence members in prepare_sort_from_pathkeys is
>>>>>> something i'm really not familiar with.
>>>>>>
>>>>>
>>>>> Could be related to incremental sort, which allowed some gather merge
>>>>> paths that were impossible before. We had a couple issues related to
>>>>> that fixed in November, IIRC.
>>>>>
>>>>>> To reproduce you can either ingest and test using the toolkit I used too
>>>>>> (see https://github.com/swarm64/s64da-benchmark-toolkit/), or
>>>>>> alternatively just use the schema (see
>>>>>> https://github.com/swarm64/s64da-benchmark-toolkit/tree/master/benchmarks/tpcds/schemas/psql_native)
>>>>>>
>>>>>
>>>>> Thanks, I'll see if I can reproduce that with your schema.
>>>>>
>>>>>
>>>>> regards
>>>>>
>>>>> --
>>>>> Tomas Vondra
>>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>> The Enterprise PostgreSQL Company
>>>>
>>>> The query in question is:
>>>>
>>>> select  count(*)
>>>>          from store_sales
>>>>              ,household_demographics
>>>>              ,time_dim, store
>>>>          where ss_sold_time_sk = time_dim.t_time_sk
>>>>              and ss_hdemo_sk = household_demographics.hd_demo_sk
>>>>              and ss_store_sk = s_store_sk
>>>>              and time_dim.t_hour = 15
>>>>              and time_dim.t_minute >= 30
>>>>              and household_demographics.hd_dep_count = 7
>>>>              and store.s_store_name = 'ese'
>>>>          order by count(*)
>>>>          limit 100;
>>>>
>>>>  From debugging output it looks like this is the plan being chosen
>>>> (cheapest total path):
>>>>          Gather(store_sales household_demographics time_dim) rows=60626
>>>> cost=3145.73..699910.15
>>>>                  HashJoin(store_sales household_demographics time_dim)
>>>> rows=25261 cost=2145.73..692847.55
>>>>                    clauses: store_sales.ss_hdemo_sk =
>>>> household_demographics.hd_demo_sk
>>>>                          HashJoin(store_sales time_dim) rows=252609
>>>> cost=1989.73..692028.08
>>>>                            clauses: store_sales.ss_sold_time_sk =
>>>> time_dim.t_time_sk
>>>>                                  SeqScan(store_sales) rows=11998564
>>>> cost=0.00..658540.64
>>>>                                  SeqScan(time_dim) rows=1070
>>>> cost=0.00..1976.35
>>>>                          SeqScan(household_demographics) rows=720
>>>> cost=0.00..147.00
>>>>
>>>> prepare_sort_from_pathkeys fails to find a pathkey because
>>>> tlist_member_ignore_relabel returns null -- which seemed weird because
>>>> the sortexpr is an Aggref (in a single member equivalence class) and
>>>> the tlist contains a single member that's also an Aggref. It turns out
>>>> that the only difference between the two Aggrefs is that the tlist
>>>> entry has "aggsplit = AGGSPLIT_INITIAL_SERIAL" while the sortexpr has
>>>> aggsplit = AGGSPLIT_SIMPLE.
>>>>
>>>> That's as far as I've gotten so far, but I figured I'd get that info
>>>> out to see if it means anything obvious to anyone else.
>>>
>>> This really goes back to [1] where we fixed a similar issue by making
>>> find_em_expr_usable_for_sorting_rel parallel the rules in
>>> prepare_sort_from_pathkeys.
>>>
>>> Most of those conditions got copied, and the case we were trying to
>>> handle is the fact that prepare_sort_from_pathkeys can generate a
>>> target list entry under those conditions if one doesn't exist. However
>>> there's a further restriction there I don't remember looking at: it
>>> uses pull_var_clause and tlist_member_ignore_relabel to ensure that
>>> all of the vars that feed into the sort expression are found in the
>>> target list. As I understand it, that is: it will build a target list
>>> entry for something like "md5(column)" if "column" (and that was one
>>> of our test cases for the previous fix) is in the target list already.
>>>
>>> But there's an additional detail here: the call to pull_var_clause
>>> requests aggregates, window functions, and placeholders be treated as
>>> vars. That means for our Aggref case it would require that the two
>>> Aggrefs be fully equal, so the differing aggsplit member would cause a
>>> target list entry not to be built, hence our error here.
>>>
>>> I've attached a quick and dirty patch that encodes that final rule
>>> from prepare_sort_from_pathkeys into
>>> find_em_expr_usable_for_sorting_rel. I can't help but think that
>>> there's a cleaner way to do with this with less code duplication, but
>>> hindering that is that prepare_sort_from_pathkeys is working with a
>>> TargetList while find_em_expr_usable_for_sorting_rel is working with a
>>> list of expressions.
>>>

Yeah, I think it'll be difficult to reuse code from later planner stages
exactly because it operates on different representation. So something
like your patch is likely necessary.

As for the patch, I have a couple comments:

1) expr_list_member_ignore_relabel would deserve a better comment, and
maybe a reference to tlist_member_ignore_relabel which it copies

2) I suppose the comment before "if (ec->ec_has_volatile)" needs
updating, because now it says we're done as long as the expression is
not volatile (but we're doing more stuff).

3) Shouldn't find_em_expr_usable_for_sorting_rel now mostly mimic what
prepare_sort_from_pathkeys does? That is, try to match the entries
directly first, before the new pull_vars() business?

4) I've simplified the foreach() loop a bit. prepare_sort_from_pathkeys
does it differently, but that's because there are multiple foreach
levels, I think. Yes, we'll not free the list, but I that's what most
other places in planner do ...

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>From 57fa7a782fb1b7456c37ed64aded6410cd876258 Mon Sep 17 00:00:00 2001
From: jcoleman <jtc...@gmail.com>
Date: Thu, 15 Apr 2021 01:48:36 +0000
Subject: [PATCH 1/2] Fix find_em_expr_usable_for_sorting_rel

---
 src/backend/optimizer/path/equivclass.c       | 44 ++++++++++++++++++-
 .../regress/expected/incremental_sort.out     | 26 +++++++++++
 src/test/regress/sql/incremental_sort.sql     |  7 +++
 3 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 0188c1e9a1..27867e16e8 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -68,6 +68,7 @@ static Bitmapset *get_eclass_indexes_for_relids(PlannerInfo *root,
 												Relids relids);
 static Bitmapset *get_common_eclass_indexes(PlannerInfo *root, Relids relids1,
 											Relids relids2);
+static Expr* expr_list_member_ignore_relabel(Expr *node, List *exprlist);
 
 
 /*
@@ -798,6 +799,27 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+static Expr*
+expr_list_member_ignore_relabel(Expr *node, List *exprlist)
+{
+	ListCell   *temp;
+
+	while (node && IsA(node, RelabelType))
+		node = ((RelabelType *) node)->arg;
+
+	foreach(temp, exprlist)
+	{
+		Expr	   *tlexpr = (Expr *) lfirst(temp);
+
+		while (tlexpr && IsA(tlexpr, RelabelType))
+			tlexpr = ((RelabelType *) tlexpr)->arg;
+
+		if (equal(node, tlexpr))
+			return node;
+	}
+	return NULL;
+}
+
 /*
  * Find an equivalence class member expression that can be safely used to build
  * a sort node using the provided relation. The rules are a subset of those
@@ -819,6 +841,9 @@ find_em_expr_usable_for_sorting_rel(PlannerInfo *root, EquivalenceClass *ec,
 	{
 		EquivalenceMember *em = lfirst(lc_em);
 		Expr	   *em_expr = em->em_expr;
+		List	   *exprvars;
+		ListCell   *k;
+		PathTarget *target = rel->reltarget;
 
 		/*
 		 * We shouldn't be trying to sort by an equivalence class that
@@ -858,8 +883,23 @@ find_em_expr_usable_for_sorting_rel(PlannerInfo *root, EquivalenceClass *ec,
 		 * valuable here to expend the work trying to find a match in the
 		 * target's exprs since such a sort will have to be postponed anyway.
 		 */
-		if (!ec->ec_has_volatile)
-			return em->em_expr;
+		if (ec->ec_has_volatile)
+			return NULL;
+
+
+		exprvars = pull_var_clause((Node *) em_expr,
+								   PVC_INCLUDE_AGGREGATES |
+								   PVC_INCLUDE_WINDOWFUNCS |
+								   PVC_INCLUDE_PLACEHOLDERS);
+		foreach(k, exprvars)
+		{
+			if (!expr_list_member_ignore_relabel(lfirst(k), target->exprs))
+				break;
+		}
+		list_free(exprvars);
+
+		if (!k) /* If we made it through exprvars finding a tlist member for each */
+			return em->em_expr; /* found usable expression */
 	}
 
 	/* We didn't find any suitable equivalence class expression */
diff --git a/src/test/regress/expected/incremental_sort.out b/src/test/regress/expected/incremental_sort.out
index a417b566d9..fd18af4c0c 100644
--- a/src/test/regress/expected/incremental_sort.out
+++ b/src/test/regress/expected/incremental_sort.out
@@ -1579,6 +1579,32 @@ order by 1, 2;
    ->  Function Scan on generate_series
 (7 rows)
 
+-- Parallel sort with an aggregate that be safely generated partially in parallel,
+-- but we can't sort by the partial aggregates.
+explain (costs off) select count(*)
+from tenk1 t1
+join tenk1 t2 on t1.unique1 = t2.unique2
+join tenk1 t3 on t2.unique1 = t3.unique1
+order by count(*);
+                                          QUERY PLAN                                           
+-----------------------------------------------------------------------------------------------
+ Sort
+   Sort Key: (count(*))
+   ->  Finalize Aggregate
+         ->  Gather
+               Workers Planned: 2
+               ->  Partial Aggregate
+                     ->  Parallel Hash Join
+                           Hash Cond: (t2.unique1 = t3.unique1)
+                           ->  Parallel Hash Join
+                                 Hash Cond: (t1.unique1 = t2.unique2)
+                                 ->  Parallel Index Only Scan using tenk1_unique1 on tenk1 t1
+                                 ->  Parallel Hash
+                                       ->  Parallel Index Scan using tenk1_unique2 on tenk1 t2
+                           ->  Parallel Hash
+                                 ->  Parallel Index Only Scan using tenk1_unique1 on tenk1 t3
+(15 rows)
+
 -- Parallel sort but with expression (correlated subquery) that
 -- is prohibited in parallel plans.
 explain (costs off) select distinct
diff --git a/src/test/regress/sql/incremental_sort.sql b/src/test/regress/sql/incremental_sort.sql
index 81429164d4..67748bb64b 100644
--- a/src/test/regress/sql/incremental_sort.sql
+++ b/src/test/regress/sql/incremental_sort.sql
@@ -257,6 +257,13 @@ from tenk1, lateral (select tenk1.unique1 from generate_series(1, 1000)) as sub;
 explain (costs off) select sub.unique1, md5(stringu1)
 from tenk1, lateral (select tenk1.unique1 from generate_series(1, 1000)) as sub
 order by 1, 2;
+-- Parallel sort with an aggregate that be safely generated partially in parallel,
+-- but we can't sort by the partial aggregates.
+explain (costs off) select count(*)
+from tenk1 t1
+join tenk1 t2 on t1.unique1 = t2.unique2
+join tenk1 t3 on t2.unique1 = t3.unique1
+order by count(*);
 -- Parallel sort but with expression (correlated subquery) that
 -- is prohibited in parallel plans.
 explain (costs off) select distinct
-- 
2.30.2

>From 68740de04e3bb21b15ed239def61474428a79cf4 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@2ndquadrant.com>
Date: Fri, 16 Apr 2021 03:09:48 +0200
Subject: [PATCH 2/2] tweaks

---
 src/backend/optimizer/path/equivclass.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 27867e16e8..bf1c95c847 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -799,6 +799,14 @@ find_em_expr_for_rel(EquivalenceClass *ec, RelOptInfo *rel)
 	return NULL;
 }
 
+/*
+ * Checks if node is member of the list. RelabelType is ignored for expressions
+ * in the list.
+ *
+ * XXX We have list_member_XXX for different types, so maybe make that consistent?
+ * XXX Perhaps explain why the relabel is ignored. Why is it ignored recursively?
+ * XXX Maybe reference tlist_member_ignore_relabel, which is doing the same thing?
+ */
 static Expr*
 expr_list_member_ignore_relabel(Expr *node, List *exprlist)
 {
@@ -882,24 +890,31 @@ find_em_expr_usable_for_sorting_rel(PlannerInfo *root, EquivalenceClass *ec,
 		 * up a volatile expression to the proper tlist entry, it doesn't seem
 		 * valuable here to expend the work trying to find a match in the
 		 * target's exprs since such a sort will have to be postponed anyway.
+		 *
+		 * XXX Doesn't this comment need an update?
+		 *
+		 * XXX Maybe we should start matching the tlist entries? After all,
+		 * prepare_sort_from_pathkeys matches directly, and only if not found
+		 * it does the pull_var_clause stuff, no?
 		 */
 		if (ec->ec_has_volatile)
 			return NULL;
 
-
+		/* XXX comment would be nice ... */
 		exprvars = pull_var_clause((Node *) em_expr,
 								   PVC_INCLUDE_AGGREGATES |
 								   PVC_INCLUDE_WINDOWFUNCS |
 								   PVC_INCLUDE_PLACEHOLDERS);
+
+		/* All expressions have to be present in the target list. */
 		foreach(k, exprvars)
 		{
 			if (!expr_list_member_ignore_relabel(lfirst(k), target->exprs))
-				break;
+				return NULL;
 		}
-		list_free(exprvars);
 
-		if (!k) /* If we made it through exprvars finding a tlist member for each */
-			return em->em_expr; /* found usable expression */
+		/* all the expressions have a matching target entry. */
+		return em->em_expr;
 	}
 
 	/* We didn't find any suitable equivalence class expression */
-- 
2.30.2

Reply via email to