Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

Jeevan Chalke Mon, 24 Oct 2016 03:47:55 -0700

On Sat, Oct 22, 2016 at 9:09 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:


> brolga is still not terribly happy with this patch: it's choosing not to
> push down the aggregates in one of the queries.  While I failed to
> duplicate that result locally, investigation suggests that brolga's result
> is perfectly sane; in fact it's not very clear why we're not getting that
> from multiple critters, because the plan brolga is choosing is not
> inferior to the expected one.
>
> The core of the problem is this subquery:
>
> contrib_regression=# explain verbose select min(13), avg(ft1.c1),
> sum(ft2.c1) from ft1 right join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12;
>
>  QUERY PLAN
> ------------------------------------------------------------
> ------------------------------------------------------------
> ---------------------------------
>  Foreign Scan  (cost=108.61..108.64 rows=1 width=44)
>    Output: (min(13)), (avg(ft1.c1)), (sum(ft2.c1))
>    Relations: Aggregate on ((public.ft1) INNER JOIN (public.ft2))
>    Remote SQL: SELECT min(13), avg(r1."C 1"), sum(r2."C 1") FROM ("S 1"."T
> 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" =
> 12))))
> (4 rows)
>
> If you look at the estimate to just fetch the data, it's:
>
> contrib_regression=# explain verbose select ft1.c1, ft2.c1 from ft1 right
> join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12;
>                                                               QUERY PLAN
> ------------------------------------------------------------
> --------------------------------------------------------------------------
>  Foreign Scan  (cost=100.55..108.62 rows=1 width=8)
>    Output: ft1.c1, ft2.c1
>    Relations: (public.ft1) INNER JOIN (public.ft2)
>    Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN
> "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" = 12))))
> (4 rows)
>
> Note we're expecting only one row out of the join.  Now the cost of doing
> three aggregates on a single row of input is not a lot.  Comparing these
> local queries:
>
> regression=# explain select min(13),avg(q1),sum(q2) from int8_tbl where
> q2=456;
>                           QUERY PLAN
> ---------------------------------------------------------------
>  Aggregate  (cost=1.07..1.08 rows=1 width=68)
>    ->  Seq Scan on int8_tbl  (cost=0.00..1.06 rows=1 width=16)
>          Filter: (q2 = 456)
> (3 rows)
>
> regression=# explain select (q1),(q2) from int8_tbl where q2=456;
>                        QUERY PLAN
> ---------------------------------------------------------
>  Seq Scan on int8_tbl  (cost=0.00..1.06 rows=1 width=16)
>    Filter: (q2 = 456)
> (2 rows)
>
> we seem to have startup = input cost + .01 and then another .01
> for total.  So the estimate to do the above remote scan and then
> aggregate locally should have been 108.63 startup and 108.64 total,
> give or take.  The estimate for aggregating remotely is a hair better,
> but it's not nearly better enough to guarantee that the planner won't
> see it as fuzzily the same cost.
>
> In short: the problem with this test case is that it's considering
> aggregation over only a single row, which is a situation in which
> pushing down the aggregate actually doesn't save us anything, because
> we're retrieving one row from the remote either way.  So it's not at all
> surprising that we don't get a stable plan choice.  The test query needs
> to be adjusted so that the aggregation is done over multiple rows,
> allowing fdw_tuple_cost to kick in and provide some daylight between
> the cost estimates.
>

Attached patch which performs aggrgation over 1000 rows as suggested by Tom.

I believe it will have a stable output/plan now.

Thanks


>
>                         regards, tom lane
>



-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

agg_push_down_fix_testcase.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate Push Down - Performing aggregation on foreign server

Reply via email to