[HACKERS] Aggregates push-down to partitions

Konstantin Knizhnik Thu, 09 Nov 2017 09:15:34 -0800

There is a huge thread concerning pushing-down aggregates to FDW:


https://www.postgresql.org/message-id/flat/CAFjFpRcnueviDpngJ3QSVvj7oyukr9NkSiCspqd4N%2BdCEdvYvg%40mail.gmail.com#cafjfprcnuevidpngj3qsvvj7oyukr9nksicspqd4n+dcedv...@mail.gmail.com

but as far as I understand, nothing has been done for efficient calculation of aggregates over a partitioned table. In the case of local partitions this is partly compensated for by a parallel query plan:


postgres=# create table base(x integer);
CREATE TABLE
postgres=# create table derived1() inherits (base);
CREATE TABLE
postgres=# create table derived2() inherits (base);
CREATE TABLE
postgres=# insert into derived1  values (generate_series(1,1000000));
INSERT 0 1000000
postgres=# insert into derived2  values (generate_series(1,1000000));
INSERT 0 1000000
postgres=# explain select sum(x) from base;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=12176.63..12176.64 rows=1 width=8)
   ->  Gather  (cost=12176.59..12176.61 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=12175.59..12175.60 rows=1 width=8)
               ->  Append  (cost=0.00..11510.47 rows=266048 width=4)
                     ->  Parallel Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
                     ->  Parallel Seq Scan on derived1  (cost=0.00..5675.00 rows=125000 width=4)
                     ->  Parallel Seq Scan on derived2  (cost=0.00..5835.47 rows=141047 width=4)
(8 rows)

This is still far from an ideal plan, because each worker scans all partitions, instead of the partitions being split between workers and partial aggregates being calculated for each partition.

But if we add a foreign table (FDW) as a child of the parent table, then a parallel scan cannot be used and we get the worst possible plan:

postgres=# create foreign table derived_fdw() inherits(base) server pg_fdw options (table_name 'derived1');
CREATE FOREIGN TABLE

postgres=# explain select sum(x) from base;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
   ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
         ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
         ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=1000000 width=4)
         ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=1000000 width=4)
         ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
(6 rows)

So we sequentially pull all the data to this node and compute the aggregates locally. An ideal plan would calculate partial aggregates in parallel on all nodes and then combine the partial results.

It requires two changes:

1. Replace Aggregate->Append with Finalize_Aggregate->Append->Partial_Aggregate.
2. Concurrent execution of Append. This can be done in two different ways: we can try to use the existing parallel worker infrastructure and replace Append with Gather, which seems to be the best approach for local partitioning. In the case of remote (FDW) partitions, it is enough to separate starting the execution (PQsendQuery in postgres_fdw) from fetching the results, which requires some changes in the FDW protocol.
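
To illustrate the first change, the plan shape Finalize_Aggregate->Append->Partial_Aggregate can be approximated by hand with a rewritten query. This is only a sketch against the tables created above, not something the planner produces: the aliases partial_sum and per_partition are made up for illustration, and whether the derived_fdw branch is actually aggregated on the remote side depends on the foreign data wrapper supporting aggregate push-down (postgres_fdw does in PostgreSQL 10, as far as I know).

```sql
-- Hand-written equivalent of Finalize Aggregate -> Append -> Partial Aggregate.
-- Each UNION ALL branch plays the role of one Partial Aggregate;
-- the outer sum() plays the role of the Finalize Aggregate.
SELECT sum(partial_sum) AS total
FROM (
    SELECT sum(x) AS partial_sum FROM ONLY base   -- parent's own rows only
    UNION ALL
    SELECT sum(x) FROM derived1
    UNION ALL
    SELECT sum(x) FROM derived2
    UNION ALL
    SELECT sum(x) FROM derived_fdw                -- foreign partition
) AS per_partition;
```

The result should equal select sum(x) from base, except that the outer sum over bigint partial sums returns numeric rather than bigint. Empty partitions contribute NULL, which sum() ignores. The branches are still executed one after another, so this rewrite does nothing for the second change.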

I wonder whether somebody has already investigated this problem or is working in this direction.

Maybe some patches have already been proposed?
I have searched the hackers archive, but didn't find anything relevant...
Are there any suggestions about the best approach to implement this feature?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



