Add enable_presorted_aggregate GUC

David Rowley Thu, 15 Dec 2022 15:48:38 -0800

Over on [1] Pavel reports that PG16's performance is slower than
PG14's for his ORDER BY aggregate query.  This seems to be mainly down
to how the costing works for Incremental Sort. Now, ever since
1349d279, the planner will make a plan that's ordered by the GROUP BY
clause and add on the path key requirements for any ORDER BY
aggregates.  I had in mind that if someone already had tuned their
schema to have a btree index on the GROUP BY clause to allow Group
Aggregate to have presorted rows from the index that the planner would
likely just take that index and add an Incremental Sort on top of that
to obtain the rows in the order required for the aggregate functions.


The main reason for Pavel's reported regression is due to a cost
"pessimism factor" that was added to incremental sorts where we
multiply by 1.5 to reduce the chances of an Incremental Sort plan due
to uncertainties around the number of tuples per presorted group. We
assume each group will have the same number of tuples. If there's some
skew then there will be a larger N factor in the qsort complexity of
O(N * log2(N)).

Over on [1], I'm proposing that we remove that pessimism factor. If we
keep teaching the planner new tricks but cost them pessimistically
then we're not taking full advantage of said new tricks. If you
disagree with that, then best to raise it on [1].

On [1], in addition to removing the * 1.5 factor, I also propose that
we add a new GUC named "enable_presorted_aggregate", which, if turned
off will make the planner not request that the plan is also ordered by
the aggregate function's ORDER BY / DISTINCT clause.  The reason I
think that this is required is that even with removing the pessimism
factor from incremental sort, it's possible the planner will choose to
perform a full sort rather than an incremental sort on top of some
existing index which provides presorted input for only the query's
GROUP BY clause. e.g.

CREATE TABLE ab (a INT, b INT);
CREATE INDEX ON ab(a);

EXPLAIN SELECT a,array_agg(b ORDER BY b) FROM ab GROUP BY a;

Previous to 1349d279, the planner, assuming there's a good amount of
rows in the table, would be very likely to use the index for the GROUP
BY then the executor would be left to do a sort on "b" within
nodeAgg.c.  The equivalent of that post-1349d279 is Index Scan ->
Incremental Sort -> Group Aggregate (with presorted input).  However,
the planner could choose to: Seq Scan -> Sort (a,b) -> Group Aggregate
(with presorted input).  So this leaves an opportunity for the planner
to choose a worse plan.

Normally we add some enable_* GUC to leave an escape hatch when we add
some new feature like this. I likely should have done that when I
added 1349d279, but I didn't and I want to now.

I mainly just shifted this discussion out of [1] as we normally like
to debate GUC names and I feel that the discussion over on the other
thread is buried a little too deep to be visible to most people.

Can anyone think of a better name?  Or does anyone see error with my ambition?

David

[1] 
https://www.postgresql.org/message-id/9f61ddbf-2989-1536-b31e-6459370a6...@postgrespro.ru

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8e4145979d..9eedab652d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5311,6 +5311,29 @@ ANY <replaceable 
class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-presorted-aggregate" 
xreflabel="enable_presorted_aggregate">
+      <term><varname>enable_presorted_aggregate</varname> 
(<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_presorted_aggregate</varname> configuration 
parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Controls if the query planner will produce a plan which will provide
+        rows which are presorted in the order required for the query's
+        <literal>ORDER BY</literal> / <literal>DISTINCT</literal> aggregate
+        functions.  When disabled, the query planner will produce a plan which
+        will always require the executor to perform a sort before performing
+        aggregation of each aggregate function containing an
+        <literal>ORDER BY</literal> or <literal>DISTINCT</literal> clause.
+        When enabled, the planner will try to produce a more efficient plan
+        which provides input to the aggregate functions which is presorted in
+        the order they require for aggregation.  The default value is
+        <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
       <term><varname>enable_seqscan</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index 0f0bcfb7e5..89d3c4352c 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -151,6 +151,7 @@ bool                enable_partitionwise_aggregate = false;
 bool           enable_parallel_append = true;
 bool           enable_parallel_hash = true;
 bool           enable_partition_pruning = true;
+bool           enable_presorted_aggregate = true;
 bool           enable_async_append = true;
 
 typedef struct
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index dfda251d95..e21e72eb87 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3191,7 +3191,8 @@ make_pathkeys_for_groupagg(PlannerInfo *root, List 
*groupClause, List *tlist,
         * sets.  All handling specific to ordered aggregates must be done by 
the
         * executor in that case.
         */
-       if (root->numOrderedAggs == 0 || root->parse->groupingSets != NIL)
+       if (root->numOrderedAggs == 0 || root->parse->groupingSets != NIL ||
+               !enable_presorted_aggregate)
                return grouppathkeys;
 
        /*
diff --git a/src/backend/utils/misc/guc_tables.c 
b/src/backend/utils/misc/guc_tables.c
index 1bf14eec66..436afe1d21 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -971,6 +971,21 @@ struct config_bool ConfigureNamesBool[] =
                true,
                NULL, NULL, NULL
        },
+       {
+               {"enable_presorted_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+                       gettext_noop("Enables the planner's ability to produce 
plans which "
+                                                "provide presorted input for 
ORDER BY / DISTINCT aggregate "
+                                                "functions."),
+                       gettext_noop("Allows the query planner to build plans 
which provide "
+                                                "presorted input for aggregate 
functions with an ORDER BY / "
+                                                "DISTINCT clause.  When 
disabled, implicit sorts are always "
+                                                "performed during execution."),
+                       GUC_EXPLAIN
+               },
+               &enable_presorted_aggregate,
+               true,
+               NULL, NULL, NULL
+       },
        {
                {"enable_async_append", PGC_USERSET, QUERY_TUNING_METHOD,
                        gettext_noop("Enables the planner's use of async append 
plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample 
b/src/backend/utils/misc/postgresql.conf.sample
index 043864597f..5afdeb04de 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -384,6 +384,7 @@
 #enable_partition_pruning = on
 #enable_partitionwise_join = off
 #enable_partitionwise_aggregate = off
+#enable_presorted_aggregate = on
 #enable_seqscan = on
 #enable_sort = on
 #enable_tidscan = on
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 204e94b6d1..b6cc2c9cd6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -68,6 +68,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
 extern PGDLLIMPORT bool enable_parallel_append;
 extern PGDLLIMPORT bool enable_parallel_hash;
 extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_presorted_aggregate;
 extern PGDLLIMPORT bool enable_async_append;
 extern PGDLLIMPORT int constraint_exclusion;
 
diff --git a/src/test/regress/expected/aggregates.out 
b/src/test/regress/expected/aggregates.out
index fc2bd40be2..309c2dc865 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1471,6 +1471,17 @@ group by ten;
          ->  Seq Scan on tenk1
 (5 rows)
 
+-- Ensure no ordering is requested when enable_presorted_aggregate is off
+set enable_presorted_aggregate to off;
+explain (costs off)
+select sum(two order by two) from tenk1;
+       QUERY PLAN        
+-------------------------
+ Aggregate
+   ->  Seq Scan on tenk1
+(2 rows)
+
+reset enable_presorted_aggregate;
 --
 -- Test combinations of DISTINCT and/or ORDER BY
 --
diff --git a/src/test/regress/expected/sysviews.out 
b/src/test/regress/expected/sysviews.out
index 579b861d84..001c6e7eb9 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -128,10 +128,11 @@ select name, setting from pg_settings where name like 
'enable%';
  enable_partition_pruning       | on
  enable_partitionwise_aggregate | off
  enable_partitionwise_join      | off
+ enable_presorted_aggregate     | on
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(20 rows)
+(21 rows)
 
 -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
 -- more-or-less working.  We can't test their contents in any great detail
diff --git a/src/test/regress/sql/aggregates.sql 
b/src/test/regress/sql/aggregates.sql
index a4c00ff7a9..15f6be8e15 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -546,6 +546,12 @@ select
 from tenk1
 group by ten;
 
+-- Ensure no ordering is requested when enable_presorted_aggregate is off
+set enable_presorted_aggregate to off;
+explain (costs off)
+select sum(two order by two) from tenk1;
+reset enable_presorted_aggregate;
+
 --
 -- Test combinations of DISTINCT and/or ORDER BY
 --

Add enable_presorted_aggregate GUC

Reply via email to