I looked into the complaint at [1] about the planner being much stupider when one side of a JOIN USING is referenced than the other side. It seemed to me that that shouldn't be happening, because the relevant decisions are made on the basis of EquivalenceClasses and both USING columns should be in the same EquivalenceClass. I was right about that ... but the damage is being done far upstream of any EquivalenceClass work. It turns out that the core of the issue is that the query looks like
SELECT ... t1 JOIN t2 USING (x) GROUP BY x, t2.othercol ORDER BY t2.othercol LIMIT n In the "okay" case, we resolve "GROUP BY x" as GROUP BY t1.x. Later on, we are able to realize that ordering by t1.x is equivalent to ordering by t2.x (because same EquivalenceClass), and that it's equally good to consider the GROUP BY items in either order, and that ordering by t2.othercol, t2.x would allow us to perform the grouping using a GroupAggregate on data that's already sorted to meet the ORDER BY requirement. Since there happens to be an index on t2.othercol, t2.x, we can implement the query with no explicit sort, which wins big thanks to the small LIMIT. In the not-okay case, we resolve "GROUP BY x" as GROUP BY t2.x. Then remove_useless_groupby_columns notices that x is t2's primary key, so it figures that grouping by t2.othercol is redundant and throws away that element of GROUP BY. Now there is no apparent connection between the GROUP BY and ORDER BY lists, defeating the optimizations that would lead to a good choice of plan --- in fact, we conclude early on that that index's sort ordering is entirely useless to the query :-( I tried the attached quick-hack patch that just prevents remove_useless_groupby_columns from removing anything that appears in ORDER BY. That successfully fixes the complained-of case, and it doesn't change any existing regression test results. I'm not sure whether there are any cases that it makes significantly worse. (I also kind of wonder if the fundamental problem here is that remove_useless_groupby_columns is being done at the wrong time, and we ought to do it later when we have more semantic info. But I'm not volunteering to rewrite it completely.) Anyway, remove_useless_groupby_columns has been there since 9.6 and we've not previously seen reports of cases that it makes worse, so this seems like a corner-case problem. Hence I wouldn't risk back-patching this change. It seems worth considering for HEAD though, so I'll stick it in the September CF. regards, tom lane [1] https://www.postgresql.org/message-id/17544-ebd06b00b8836a04%40postgresql.org
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 06ad856eac..b2716abcc7 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -2733,12 +2733,15 @@ remove_useless_groupby_columns(PlannerInfo *root) /* * New list must include non-Vars, outer Vars, and anything not - * marked as surplus. + * marked as surplus. In addition, keep anything that appears in + * the ORDER BY clause, because otherwise we may falsely make it + * look like the GROUP BY and ORDER BY clauses are incompatible. */ if (!IsA(var, Var) || var->varlevelsup > 0 || !bms_is_member(var->varattno - FirstLowInvalidHeapAttributeNumber, - surplusvars[var->varno])) + surplusvars[var->varno]) || + list_member(parse->sortClause, sgc)) new_groupby = lappend(new_groupby, sgc); }