Over on [1] I was complaining that I thought DEFAULT_FDW_TUPLE_COST, which is defined as 0.01 was unrealistically low.
For comparison, cpu_tuple_cost, something we probably expect to be in a CPU cache is also 0.01. We've defined DEFAULT_PARALLEL_TUPLE_COST to be 0.1, which is 10x cpu_tuple_cost. That's coming from a shared memory segment. So why do we think DEFAULT_FDW_TUPLE_COST should be the same as cpu_tuple_cost when that's probably pulling a tuple from some remote server over some (possibly slow) network? I did a little experiment in the attached .sql file and did some maths to try to figure out what it's really likely to be costing us. I tried this with and without the attached hack to have the planner not consider remote grouping just to see how much slower pulling a million tuples through the FDW would cost. I setup a loopback server on localhost (which has about the lowest possible network latency) and found the patched query to the foreign server took: Execution Time: 530.000 ms This is pulling all million tuples over and doing the aggregate locally. Unpatched, the query took: Execution Time: 35.334 ms so about 15x faster. If I take the seqscan cost for querying the local table, which is 14425.00 multiply that by 15 (the extra time it took to pull the 1 million tuples) then divide by 1 million to get the extra cost per tuple, then that comes to about 0.216. So that says DEFAULT_FDW_TUPLE_COST is about 21x lower than it should be. I tried cranking DEFAULT_FDW_TUPLE_COST up to 0.5 to see what plans would change in the postgres_fdw regression tests and quite a number changed. Many seem to be pushing the sorts down to the remote server where they were being done locally before. A few others just seem weird. For example, the first one seems to be blindly adding a remote sort when it does no good. I think it would take quite a bit of study with a debugger to figure out what's going on with many of these. Does anyone have any ideas why DEFAULT_FDW_TUPLE_COST was set so low? Does anyone object to it being set to something more realistic? David [1] https://www.postgresql.org/message-id/CAApHDvpXiXLxg4TsA8P_4etnuGQqAAbHWEOM4hGe=dcaxmi...@mail.gmail.com
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 64632db73c..b4e3b91d7f 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -3921,7 +3921,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel, * If there is an FDW that's responsible for all baserels of the query, * let it consider adding ForeignPaths. */ - if (grouped_rel->fdwroutine && + if (0 && grouped_rel->fdwroutine && grouped_rel->fdwroutine->GetForeignUpperPaths) grouped_rel->fdwroutine->GetForeignUpperPaths(root, UPPERREL_GROUP_AGG, input_rel, grouped_rel,
fdw_experiment.sql
Description: Binary data
--- "expected\\postgres_fdw.out" 2022-08-03 01:34:42.806967000 +1200 +++ "results\\postgres_fdw.out" 2022-08-03 02:33:40.719712900 +1200 @@ -2164,8 +2164,8 @@ -- unsafe conditions on one side (c8 has a UDT), not pushed down. EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10; - QUERY PLAN ------------------------------------------------------------------------------ + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------ Limit Output: t1.c1, t2.c1, t1.c3 -> Sort @@ -2182,7 +2182,7 @@ -> Foreign Scan on public.ft1 t1 Output: t1.c1, t1.c3 Filter: (t1.c8 = 'foo'::user_enum) - Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1" + Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1" ORDER BY c3 ASC NULLS LAST, "C 1" ASC NULLS LAST (17 rows) SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10; @@ -2873,13 +2873,13 @@ Sort Output: (sum(c1)), c2 Sort Key: (sum(ft1.c1)) - -> HashAggregate + -> GroupAggregate Output: sum(c1), c2 Group Key: ft1.c2 Filter: (avg((ft1.c1 * ((random() <= '1'::double precision))::integer)) > '100'::numeric) -> Foreign Scan on public.ft1 Output: c1, c2 - Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" + Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC NULLS LAST (10 rows) -- Remote aggregate in combination with a local Param (for the output @@ -3123,12 +3123,12 @@ Sort Output: (sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) <= '1'::double precision))), c2 Sort Key: (sum(ft1.c1) FILTER (WHERE ((((ft1.c1 / ft1.c1))::double precision * random()) <= '1'::double precision))) - -> HashAggregate + -> GroupAggregate Output: sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) <= '1'::double precision)), c2 Group Key: ft1.c2 -> Foreign Scan on public.ft1 Output: c1, c2 - Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" + Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC NULLS LAST (9 rows) explain (verbose, costs off) @@ -3885,24 +3885,21 @@ -- subquery using stable function (can't be sent to remote) PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1; EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20); - QUERY PLAN ----------------------------------------------------------------------------------------------------------- - Sort + QUERY PLAN +---------------------------------------------------------------------------------------------------------------------------------- + Nested Loop Semi Join Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Sort Key: t1.c1 - -> Nested Loop Semi Join + Join Filter: (t1.c3 = t2.c3) + -> Foreign Scan on public.ft1 t1 Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Join Filter: (t1.c3 = t2.c3) - -> Foreign Scan on public.ft1 t1 - Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20)) - -> Materialize + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" < 20)) ORDER BY "C 1" ASC NULLS LAST + -> Materialize + Output: t2.c3 + -> Foreign Scan on public.ft2 t2 Output: t2.c3 - -> Foreign Scan on public.ft2 t2 - Output: t2.c3 - Filter: (date(t2.c4) = '01-17-1970'::date) - Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10)) -(15 rows) + Filter: (date(t2.c4) = '01-17-1970'::date) + Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10)) +(12 rows) EXECUTE st2(10, 20); c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 @@ -9381,21 +9378,19 @@ -- test FOR UPDATE; partitionwise join does not apply EXPLAIN (COSTS OFF) SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1; - QUERY PLAN --------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------- LockRows - -> Sort - Sort Key: t1.a - -> Hash Join - Hash Cond: (t2.b = t1.a) + -> Nested Loop + Join Filter: (t1.a = t2.b) + -> Append + -> Foreign Scan on ftprt1_p1 t1_1 + -> Foreign Scan on ftprt1_p2 t1_2 + -> Materialize -> Append -> Foreign Scan on ftprt2_p1 t2_1 -> Foreign Scan on ftprt2_p2 t2_2 - -> Hash - -> Append - -> Foreign Scan on ftprt1_p1 t1_1 - -> Foreign Scan on ftprt1_p2 t1_2 -(12 rows) +(10 rows) SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1; a | b @@ -9430,18 +9425,16 @@ SET enable_partitionwise_aggregate TO false; EXPLAIN (COSTS OFF) SELECT a, sum(b), min(b), count(*) FROM pagg_tab GROUP BY a HAVING avg(b) < 22 ORDER BY 1; - QUERY PLAN ------------------------------------------------------------ - Sort - Sort Key: pagg_tab.a - -> HashAggregate - Group Key: pagg_tab.a - Filter: (avg(pagg_tab.b) < '22'::numeric) - -> Append - -> Foreign Scan on fpagg_tab_p1 pagg_tab_1 - -> Foreign Scan on fpagg_tab_p2 pagg_tab_2 - -> Foreign Scan on fpagg_tab_p3 pagg_tab_3 -(9 rows) + QUERY PLAN +----------------------------------------------------- + GroupAggregate + Group Key: pagg_tab.a + Filter: (avg(pagg_tab.b) < '22'::numeric) + -> Append + -> Foreign Scan on fpagg_tab_p1 pagg_tab_1 + -> Foreign Scan on fpagg_tab_p2 pagg_tab_2 + -> Foreign Scan on fpagg_tab_p3 pagg_tab_3 +(7 rows) -- Plan with partitionwise aggregates is enabled SET enable_partitionwise_aggregate TO true; @@ -9475,34 +9468,32 @@ -- Should have all the columns in the target list for the given relation EXPLAIN (VERBOSE, COSTS OFF) SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1; - QUERY PLAN ------------------------------------------------------------------------- - Sort - Output: t1.a, (count(((t1.*)::pagg_tab))) + QUERY PLAN +-------------------------------------------------------------------------------------------- + Merge Append Sort Key: t1.a - -> Append - -> HashAggregate - Output: t1.a, count(((t1.*)::pagg_tab)) - Group Key: t1.a - Filter: (avg(t1.b) < '22'::numeric) - -> Foreign Scan on public.fpagg_tab_p1 t1 - Output: t1.a, t1.*, t1.b - Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 - -> HashAggregate - Output: t1_1.a, count(((t1_1.*)::pagg_tab)) - Group Key: t1_1.a - Filter: (avg(t1_1.b) < '22'::numeric) - -> Foreign Scan on public.fpagg_tab_p2 t1_1 - Output: t1_1.a, t1_1.*, t1_1.b - Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 - -> HashAggregate - Output: t1_2.a, count(((t1_2.*)::pagg_tab)) - Group Key: t1_2.a - Filter: (avg(t1_2.b) < '22'::numeric) - -> Foreign Scan on public.fpagg_tab_p3 t1_2 - Output: t1_2.a, t1_2.*, t1_2.b - Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 -(25 rows) + -> GroupAggregate + Output: t1.a, count(((t1.*)::pagg_tab)) + Group Key: t1.a + Filter: (avg(t1.b) < '22'::numeric) + -> Foreign Scan on public.fpagg_tab_p1 t1 + Output: t1.a, t1.*, t1.b + Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 ORDER BY a ASC NULLS LAST + -> GroupAggregate + Output: t1_1.a, count(((t1_1.*)::pagg_tab)) + Group Key: t1_1.a + Filter: (avg(t1_1.b) < '22'::numeric) + -> Foreign Scan on public.fpagg_tab_p2 t1_1 + Output: t1_1.a, t1_1.*, t1_1.b + Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 ORDER BY a ASC NULLS LAST + -> GroupAggregate + Output: t1_2.a, count(((t1_2.*)::pagg_tab)) + Group Key: t1_2.a + Filter: (avg(t1_2.b) < '22'::numeric) + -> Foreign Scan on public.fpagg_tab_p3 t1_2 + Output: t1_2.a, t1_2.*, t1_2.b + Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 ORDER BY a ASC NULLS LAST +(23 rows) SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1; a | count @@ -9518,24 +9509,23 @@ -- When GROUP BY clause does not match with PARTITION KEY. EXPLAIN (COSTS OFF) SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING sum(a) < 700 ORDER BY 1; - QUERY PLAN ------------------------------------------------------------------ - Sort - Sort Key: pagg_tab.b - -> Finalize HashAggregate - Group Key: pagg_tab.b - Filter: (sum(pagg_tab.a) < 700) - -> Append - -> Partial HashAggregate - Group Key: pagg_tab.b - -> Foreign Scan on fpagg_tab_p1 pagg_tab - -> Partial HashAggregate - Group Key: pagg_tab_1.b - -> Foreign Scan on fpagg_tab_p2 pagg_tab_1 - -> Partial HashAggregate - Group Key: pagg_tab_2.b - -> Foreign Scan on fpagg_tab_p3 pagg_tab_2 -(15 rows) + QUERY PLAN +----------------------------------------------------------- + Finalize GroupAggregate + Group Key: pagg_tab.b + Filter: (sum(pagg_tab.a) < 700) + -> Merge Append + Sort Key: pagg_tab.b + -> Partial GroupAggregate + Group Key: pagg_tab.b + -> Foreign Scan on fpagg_tab_p1 pagg_tab + -> Partial GroupAggregate + Group Key: pagg_tab_1.b + -> Foreign Scan on fpagg_tab_p2 pagg_tab_1 + -> Partial GroupAggregate + Group Key: pagg_tab_2.b + -> Foreign Scan on fpagg_tab_p3 pagg_tab_2 +(14 rows) -- =================================================================== -- access rights and superuser