On 4/12/24 06:44, Tom Lane wrote:
* It seems like root->processed_groupClause no longer has much to do
with the grouping behavior, which is scary given how much code still
believes that it does.  I suspect there are bugs lurking there, and
1. Analysing the code, processed_groupClause is used in:
- adjust_foreign_grouping_path_cost - to decide, do we need to add cost of sort to the foreign grouping.
- just for replacing relids during SJE process.
- create_groupingsets_plan(), preprocess_grouping_sets, remove_useless_groupby_columns - we don't apply this feature in the case of grouping sets. - get_number_of_groups - estimation shouldn't depend on the order of clauses.
- create_grouping_paths - to analyse hashability of grouping clause
- make_group_input_target, make_partial_grouping_target - doesn't depend on the order of clauses planner.c: add_paths_to_grouping_rel, create_partial_grouping_paths - in the case of hash grouping.

So, the only case we can impact current behavior lies in the postgres_fdw. But here we don't really know which plan will be chosen during planning on foreign node and stay the same behavior is legal for me.
am not comforted by the fact that the patch changed exactly nothing
in the pathnodes.h documentation of that field.  This comment looks
pretty silly now too:

             /* Preprocess regular GROUP BY clause, if any */
             root->processed_groupClause = list_copy(parse->groupClause);

What "preprocessing" is going on there?  This comment was adequate
before, when the code line invoked preprocess_groupclause which had
a bunch of commentary; but now it has to stand on its own and it's
not doing a great job of that.
Thanks for picking this place. I fixed it.
More interesting here is that we move decisions on the order of group-by clauses to the stage, where we already have underlying subtree and incoming path keys. In principle, we could return the preprocessing routine and arrange GROUP-BY with the ORDER-BY clause as it was before. But looking into the code, I found only one reason to do this: adjust_group_pathkeys_for_groupagg. Curious about how much profit we get here, I think we can discover it later with no hurry. A good outcome here will be a test case that can show the importance of arranging GROUP-BY and ORDER-BY at an early stage.

--
regards,
Andrei Lepikhov
Postgres Professional
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5ea3705796..861656a192 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1424,7 +1424,10 @@ grouping_planner(PlannerInfo *root, double tuple_fraction,
 		}
 		else if (parse->groupClause)
 		{
-			/* Preprocess regular GROUP BY clause, if any */
+			/*
+			 * Make a copy of origin groupClause because we are going to
+			 * remove redundant clauses.
+			 */
 			root->processed_groupClause = list_copy(parse->groupClause);
 			/* Remove any redundant GROUP BY columns */
 			remove_useless_groupby_columns(root);

Reply via email to