brolga is still not terribly happy with this patch: it's choosing not to push down the aggregates in one of the queries. While I failed to duplicate that result locally, investigation suggests that brolga's result is perfectly sane; in fact it's not very clear why we're not getting that from multiple critters, because the plan brolga is choosing is not inferior to the expected one.
The core of the problem is this subquery: contrib_regression=# explain verbose select min(13), avg(ft1.c1), sum(ft2.c1) from ft1 right join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Foreign Scan (cost=108.61..108.64 rows=1 width=44) Output: (min(13)), (avg(ft1.c1)), (sum(ft2.c1)) Relations: Aggregate on ((public.ft1) INNER JOIN (public.ft2)) Remote SQL: SELECT min(13), avg(r1."C 1"), sum(r2."C 1") FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" = 12)))) (4 rows) If you look at the estimate to just fetch the data, it's: contrib_regression=# explain verbose select ft1.c1, ft2.c1 from ft1 right join ft2 on (ft1.c1 = ft2.c1) where ft1.c1 = 12; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------- Foreign Scan (cost=100.55..108.62 rows=1 width=8) Output: ft1.c1, ft2.c1 Relations: (public.ft1) INNER JOIN (public.ft2) Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = 12)) AND ((r1."C 1" = 12)))) (4 rows) Note we're expecting only one row out of the join. Now the cost of doing three aggregates on a single row of input is not a lot. Comparing these local queries: regression=# explain select min(13),avg(q1),sum(q2) from int8_tbl where q2=456; QUERY PLAN --------------------------------------------------------------- Aggregate (cost=1.07..1.08 rows=1 width=68) -> Seq Scan on int8_tbl (cost=0.00..1.06 rows=1 width=16) Filter: (q2 = 456) (3 rows) regression=# explain select (q1),(q2) from int8_tbl where q2=456; QUERY PLAN --------------------------------------------------------- Seq Scan on int8_tbl (cost=0.00..1.06 rows=1 width=16) Filter: (q2 = 456) (2 rows) we seem to have startup = input cost + .01 and then another .01 for total. So the estimate to do the above remote scan and then aggregate locally should have been 108.63 startup and 108.64 total, give or take. The estimate for aggregating remotely is a hair better, but it's not nearly better enough to guarantee that the planner won't see it as fuzzily the same cost. In short: the problem with this test case is that it's considering aggregation over only a single row, which is a situation in which pushing down the aggregate actually doesn't save us anything, because we're retrieving one row from the remote either way. So it's not at all surprising that we don't get a stable plan choice. The test query needs to be adjusted so that the aggregation is done over multiple rows, allowing fdw_tuple_cost to kick in and provide some daylight between the cost estimates. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers