Why is DEFAULT_FDW_TUPLE_COST so insanely low?

David Rowley Tue, 02 Aug 2022 07:56:46 -0700

Over on [1] I was complaining that I thought DEFAULT_FDW_TUPLE_COST,
which is defined as 0.01 was unrealistically low.

For comparison, cpu_tuple_cost, something we probably expect to be in
a CPU cache is also 0.01. We've defined DEFAULT_PARALLEL_TUPLE_COST
to be 0.1, which is 10x cpu_tuple_cost. That's coming from a shared
memory segment. So why do we think DEFAULT_FDW_TUPLE_COST should be
the same as cpu_tuple_cost when that's probably pulling a tuple from
some remote server over some (possibly slow) network?

I did a little experiment in the attached .sql file and did some maths
to try to figure out what it's really likely to be costing us. I tried
this with and without the attached hack to have the planner not
consider remote grouping just to see how much slower pulling a million
tuples through the FDW would cost.

I setup a loopback server on localhost (which has about the lowest
possible network latency) and found the patched query to the foreign
server took:

Execution Time: 530.000 ms

This is pulling all million tuples over and doing the aggregate locally.

Unpatched, the query took:

Execution Time: 35.334 ms

so about 15x faster.

If I take the seqscan cost for querying the local table, which is
14425.00 multiply that by 15 (the extra time it took to pull the 1
million tuples) then divide by 1 million to get the extra cost per
tuple, then that comes to about 0.216. So that says
DEFAULT_FDW_TUPLE_COST is about 21x lower than it should be.

I tried cranking DEFAULT_FDW_TUPLE_COST up to 0.5 to see what plans
would change in the postgres_fdw regression tests and quite a number
changed. Many seem to be pushing the sorts down to the remote server
where they were being done locally before. A few others just seem
weird. For example, the first one seems to be blindly adding a remote
sort when it does no good. I think it would take quite a bit of study
with a debugger to figure out what's going on with many of these.

Does anyone have any ideas why DEFAULT_FDW_TUPLE_COST was set so low?

Does anyone object to it being set to something more realistic?

David

[1]
https://www.postgresql.org/message-id/CAApHDvpXiXLxg4TsA8P_4etnuGQqAAbHWEOM4hGe=dcaxmi...@mail.gmail.com

diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 64632db73c..b4e3b91d7f 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3921,7 +3921,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, 
RelOptInfo *input_rel,
         * If there is an FDW that's responsible for all baserels of the query,
         * let it consider adding ForeignPaths.
         */
-       if (grouped_rel->fdwroutine &&
+       if (0 && grouped_rel->fdwroutine &&
                grouped_rel->fdwroutine->GetForeignUpperPaths)
                grouped_rel->fdwroutine->GetForeignUpperPaths(root, 
UPPERREL_GROUP_AGG,
                                                                                
                          input_rel, grouped_rel,

fdw_experiment.sql
Description: Binary data

--- "expected\\postgres_fdw.out"        2022-08-03 01:34:42.806967000 +1200
+++ "results\\postgres_fdw.out" 2022-08-03 02:33:40.719712900 +1200
@@ -2164,8 +2164,8 @@
 -- unsafe conditions on one side (c8 has a UDT), not pushed down.
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE 
t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
-                                 QUERY PLAN                                  
------------------------------------------------------------------------------
+                                                          QUERY PLAN           
                                               
+------------------------------------------------------------------------------------------------------------------------------
  Limit
    Output: t1.c1, t2.c1, t1.c3
    ->  Sort
@@ -2182,7 +2182,7 @@
                      ->  Foreign Scan on public.ft1 t1
                            Output: t1.c1, t1.c3
                            Filter: (t1.c8 = 'foo'::user_enum)
-                           Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1"
+                           Remote SQL: SELECT "C 1", c3, c8 FROM "S 1"."T 1" 
ORDER BY c3 ASC NULLS LAST, "C 1" ASC NULLS LAST
 (17 rows)
 
 SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE 
t1.c8 = 'foo' ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
@@ -2873,13 +2873,13 @@
  Sort
    Output: (sum(c1)), c2
    Sort Key: (sum(ft1.c1))
-   ->  HashAggregate
+   ->  GroupAggregate
          Output: sum(c1), c2
          Group Key: ft1.c2
          Filter: (avg((ft1.c1 * ((random() <= '1'::double 
precision))::integer)) > '100'::numeric)
          ->  Foreign Scan on public.ft1
                Output: c1, c2
-               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC 
NULLS LAST
 (10 rows)
 
 -- Remote aggregate in combination with a local Param (for the output
@@ -3123,12 +3123,12 @@
  Sort
    Output: (sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * random()) 
<= '1'::double precision))), c2
    Sort Key: (sum(ft1.c1) FILTER (WHERE ((((ft1.c1 / ft1.c1))::double 
precision * random()) <= '1'::double precision)))
-   ->  HashAggregate
+   ->  GroupAggregate
          Output: sum(c1) FILTER (WHERE ((((c1 / c1))::double precision * 
random()) <= '1'::double precision)), c2
          Group Key: ft1.c2
          ->  Foreign Scan on public.ft1
                Output: c1, c2
-               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1"
+               Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" ORDER BY c2 ASC 
NULLS LAST
 (9 rows)
 
 explain (verbose, costs off)
@@ -3885,24 +3885,21 @@
 -- subquery using stable function (can't be sent to remote)
 PREPARE st2(int) AS SELECT * FROM ft1 t1 WHERE t1.c1 < $2 AND t1.c3 IN (SELECT 
c3 FROM ft2 t2 WHERE c1 > $1 AND date(c4) = '1970-01-17'::date) ORDER BY c1;
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st2(10, 20);
-                                                QUERY PLAN                     
                           
-----------------------------------------------------------------------------------------------------------
- Sort
+                                                            QUERY PLAN         
                                                   
+----------------------------------------------------------------------------------------------------------------------------------
+ Nested Loop Semi Join
    Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-   Sort Key: t1.c1
-   ->  Nested Loop Semi Join
+   Join Filter: (t1.c3 = t2.c3)
+   ->  Foreign Scan on public.ft1 t1
          Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-         Join Filter: (t1.c3 = t2.c3)
-         ->  Foreign Scan on public.ft1 t1
-               Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
-               Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 
1"."T 1" WHERE (("C 1" < 20))
-         ->  Materialize
+         Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" 
WHERE (("C 1" < 20)) ORDER BY "C 1" ASC NULLS LAST
+   ->  Materialize
+         Output: t2.c3
+         ->  Foreign Scan on public.ft2 t2
                Output: t2.c3
-               ->  Foreign Scan on public.ft2 t2
-                     Output: t2.c3
-                     Filter: (date(t2.c4) = '01-17-1970'::date)
-                     Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" 
> 10))
-(15 rows)
+               Filter: (date(t2.c4) = '01-17-1970'::date)
+               Remote SQL: SELECT c3, c4 FROM "S 1"."T 1" WHERE (("C 1" > 10))
+(12 rows)
 
 EXECUTE st2(10, 20);
  c1 | c2 |  c3   |              c4              |            c5            | 
c6 |     c7     | c8  
@@ -9381,21 +9378,19 @@
 -- test FOR UPDATE; partitionwise join does not apply
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE 
t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
-                          QUERY PLAN                          
---------------------------------------------------------------
+                       QUERY PLAN                       
+--------------------------------------------------------
  LockRows
-   ->  Sort
-         Sort Key: t1.a
-         ->  Hash Join
-               Hash Cond: (t2.b = t1.a)
+   ->  Nested Loop
+         Join Filter: (t1.a = t2.b)
+         ->  Append
+               ->  Foreign Scan on ftprt1_p1 t1_1
+               ->  Foreign Scan on ftprt1_p2 t1_2
+         ->  Materialize
                ->  Append
                      ->  Foreign Scan on ftprt2_p1 t2_1
                      ->  Foreign Scan on ftprt2_p2 t2_2
-               ->  Hash
-                     ->  Append
-                           ->  Foreign Scan on ftprt1_p1 t1_1
-                           ->  Foreign Scan on ftprt1_p2 t1_2
-(12 rows)
+(10 rows)
 
 SELECT t1.a, t2.b FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) WHERE 
t1.a % 25 = 0 ORDER BY 1,2 FOR UPDATE OF t1;
   a  |  b  
@@ -9430,18 +9425,16 @@
 SET enable_partitionwise_aggregate TO false;
 EXPLAIN (COSTS OFF)
 SELECT a, sum(b), min(b), count(*) FROM pagg_tab GROUP BY a HAVING avg(b) < 22 
ORDER BY 1;
-                        QUERY PLAN                         
------------------------------------------------------------
- Sort
-   Sort Key: pagg_tab.a
-   ->  HashAggregate
-         Group Key: pagg_tab.a
-         Filter: (avg(pagg_tab.b) < '22'::numeric)
-         ->  Append
-               ->  Foreign Scan on fpagg_tab_p1 pagg_tab_1
-               ->  Foreign Scan on fpagg_tab_p2 pagg_tab_2
-               ->  Foreign Scan on fpagg_tab_p3 pagg_tab_3
-(9 rows)
+                     QUERY PLAN                      
+-----------------------------------------------------
+ GroupAggregate
+   Group Key: pagg_tab.a
+   Filter: (avg(pagg_tab.b) < '22'::numeric)
+   ->  Append
+         ->  Foreign Scan on fpagg_tab_p1 pagg_tab_1
+         ->  Foreign Scan on fpagg_tab_p2 pagg_tab_2
+         ->  Foreign Scan on fpagg_tab_p3 pagg_tab_3
+(7 rows)
 
 -- Plan with partitionwise aggregates is enabled
 SET enable_partitionwise_aggregate TO true;
@@ -9475,34 +9468,32 @@
 -- Should have all the columns in the target list for the given relation
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
-                               QUERY PLAN                               
-------------------------------------------------------------------------
- Sort
-   Output: t1.a, (count(((t1.*)::pagg_tab)))
+                                         QUERY PLAN                            
             
+--------------------------------------------------------------------------------------------
+ Merge Append
    Sort Key: t1.a
-   ->  Append
-         ->  HashAggregate
-               Output: t1.a, count(((t1.*)::pagg_tab))
-               Group Key: t1.a
-               Filter: (avg(t1.b) < '22'::numeric)
-               ->  Foreign Scan on public.fpagg_tab_p1 t1
-                     Output: t1.a, t1.*, t1.b
-                     Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1
-         ->  HashAggregate
-               Output: t1_1.a, count(((t1_1.*)::pagg_tab))
-               Group Key: t1_1.a
-               Filter: (avg(t1_1.b) < '22'::numeric)
-               ->  Foreign Scan on public.fpagg_tab_p2 t1_1
-                     Output: t1_1.a, t1_1.*, t1_1.b
-                     Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2
-         ->  HashAggregate
-               Output: t1_2.a, count(((t1_2.*)::pagg_tab))
-               Group Key: t1_2.a
-               Filter: (avg(t1_2.b) < '22'::numeric)
-               ->  Foreign Scan on public.fpagg_tab_p3 t1_2
-                     Output: t1_2.a, t1_2.*, t1_2.b
-                     Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3
-(25 rows)
+   ->  GroupAggregate
+         Output: t1.a, count(((t1.*)::pagg_tab))
+         Group Key: t1.a
+         Filter: (avg(t1.b) < '22'::numeric)
+         ->  Foreign Scan on public.fpagg_tab_p1 t1
+               Output: t1.a, t1.*, t1.b
+               Remote SQL: SELECT a, b, c FROM public.pagg_tab_p1 ORDER BY a 
ASC NULLS LAST
+   ->  GroupAggregate
+         Output: t1_1.a, count(((t1_1.*)::pagg_tab))
+         Group Key: t1_1.a
+         Filter: (avg(t1_1.b) < '22'::numeric)
+         ->  Foreign Scan on public.fpagg_tab_p2 t1_1
+               Output: t1_1.a, t1_1.*, t1_1.b
+               Remote SQL: SELECT a, b, c FROM public.pagg_tab_p2 ORDER BY a 
ASC NULLS LAST
+   ->  GroupAggregate
+         Output: t1_2.a, count(((t1_2.*)::pagg_tab))
+         Group Key: t1_2.a
+         Filter: (avg(t1_2.b) < '22'::numeric)
+         ->  Foreign Scan on public.fpagg_tab_p3 t1_2
+               Output: t1_2.a, t1_2.*, t1_2.b
+               Remote SQL: SELECT a, b, c FROM public.pagg_tab_p3 ORDER BY a 
ASC NULLS LAST
+(23 rows)
 
 SELECT a, count(t1) FROM pagg_tab t1 GROUP BY a HAVING avg(b) < 22 ORDER BY 1;
  a  | count 
@@ -9518,24 +9509,23 @@
 -- When GROUP BY clause does not match with PARTITION KEY.
 EXPLAIN (COSTS OFF)
 SELECT b, avg(a), max(a), count(*) FROM pagg_tab GROUP BY b HAVING sum(a) < 
700 ORDER BY 1;
-                           QUERY PLAN                            
------------------------------------------------------------------
- Sort
-   Sort Key: pagg_tab.b
-   ->  Finalize HashAggregate
-         Group Key: pagg_tab.b
-         Filter: (sum(pagg_tab.a) < 700)
-         ->  Append
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab.b
-                     ->  Foreign Scan on fpagg_tab_p1 pagg_tab
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab_1.b
-                     ->  Foreign Scan on fpagg_tab_p2 pagg_tab_1
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab_2.b
-                     ->  Foreign Scan on fpagg_tab_p3 pagg_tab_2
-(15 rows)
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Finalize GroupAggregate
+   Group Key: pagg_tab.b
+   Filter: (sum(pagg_tab.a) < 700)
+   ->  Merge Append
+         Sort Key: pagg_tab.b
+         ->  Partial GroupAggregate
+               Group Key: pagg_tab.b
+               ->  Foreign Scan on fpagg_tab_p1 pagg_tab
+         ->  Partial GroupAggregate
+               Group Key: pagg_tab_1.b
+               ->  Foreign Scan on fpagg_tab_p2 pagg_tab_1
+         ->  Partial GroupAggregate
+               Group Key: pagg_tab_2.b
+               ->  Foreign Scan on fpagg_tab_p3 pagg_tab_2
+(14 rows)
 
 -- ===================================================================
 -- access rights and superuser

Why is DEFAULT_FDW_TUPLE_COST so insanely low?

Reply via email to