On 29.04.2017 00:13, Douglas Doole wrote:
If you add this to the commitfest app, more people might look at
it when the next commitfest opens.
I have added it. https://commitfest.postgresql.org/14/1119/
Also, it might help if you can provide a query/ies with numbers
where this optimization shows improvement.
I can't provide the real queries where we encountered the problem
because they are internal. However I showed a simplified version of
the queries in my first post.
On our queries, the change made quite a difference - execution time
dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows
that memory use dropped significantly and we didn't have to spill the
sort to disk
From:
-> Sort (cost=989.95..1013.27 rows=9326 width=30)
(node_startup_time/loop=31328.891, node_total_time/loop: 31329.756
rows=2001 loops=1) Buffers: temp read=772 written=11201 lsm_bufmgr
hits=3392 Sort Key: *** Sort Method: external merge Sort Space Used:
89592 Sort Space Type: Disk
To:
-> Sort (cost=989.95..1013.27 rows=9326 width=30)
(node_startup_time/loop=7123.275, node_total_time/loop: 7123.504
rows=2001 loops=1) Buffers: lsm_bufmgr hits=3387 Sort Key: *** Sort
Method: top-N heapsort Sort Space Used: 3256 Sort Space Type: Memory
Attached please find yet another small patch which pushes down LIMIT to
ForeignScan.
I should notice that currently Postgres optimizer is using "Merge
Append" and fetches from remote nodes only required number of tuples.
So even without LIMIT push down, postgres_fdw will not pull the whole
table from remote host.
postgres_fdw is using cursor for fetching data from remote. Default
fetch size is 100, so even without limit remote query will fetch no
more than 100 rows at remote site.
Assume the following example:
postgres=# create extension postgres_fdw;
postgres=# create server shard1 FOREIGN DATA WRAPPER postgres_fdw
options(dbname 'postgres', host 'localhost', port '5432');
postgres=# create server shard2 FOREIGN DATA WRAPPER postgres_fdw
options(dbname 'postgres', host 'localhost', port '5432');
postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user
'$user');
postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user
'$user');
postgres=# CREATE TABLE t(u integer primary key, v integer);
postgres=# CREATE TABLE t1(u integer primary key, v integer);
postgres=# CREATE TABLE t2(u integer primary key, v integer);
postgres=# insert into t1 values (generate_series(1,100000),
random()*100000);
postgres=# insert into t2 values (generate_series(1,100000),
random()*100000);
postgres=# CREATE FOREIGN TABLE t_fdw1() inherits (t) server shard1
options(table_name 't1');
postgres=# CREATE FOREIGN TABLE t_fdw2() inherits (t) server shard2
options(table_name 't2');
postgres=# explain analyze select * from t order by u limit 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Limit (cost=200.15..200.20 rows=1 width=8) (actual time=2.010..2.010
rows=1 loops=1)
-> Merge Append (cost=200.15..449.39 rows=5121 width=8) (actual
time=2.009..2.009 rows=1 loops=1)
Sort Key: t.u
-> Index Scan using t_pkey on t (cost=0.12..8.14 rows=1
width=8) (actual time=0.005..0.005 rows=0 loops=1)
-> Foreign Scan on t_fdw2 (cost=100.00..193.92 rows=2560
width=8) (actual time=1.074..1.074rows=1 loops=1)
-> Foreign Scan on t_fdw1 (cost=100.00..193.92 rows=2560
width=8) (actual time=0.928..0.928rows=1 loops=1)
Planning time: 0.769 ms
Execution time: 6.837 ms
(8 rows)
As you can see foreign scan fetches only one row from each remote node.
But still pushing down limit can have positive effect on performance,
especially if SORT can be replaced with TOP-N.
I got the following results (time in seconds):
Query
original
limit push down
select * from t order by u limit 1
2.276
1.777
select * from t order by v limit 1
100 42
There is index for "u", so fetching records with smallest "u" values can
be done without sorting, so times are similar.
But in case of sorting by "v", pushing down limit allows to use TOP-1
instead of global sort and it reduces query execution time more than 2
times.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 080cb0a..e3847ce 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2949,7 +2949,8 @@ create_cursor(ForeignScanState *node)
initStringInfo(&buf);
appendStringInfo(&buf, "DECLARE c%u CURSOR FOR\n%s",
fsstate->cursor_number, fsstate->query);
-
+ if (node->limit > 0)
+ appendStringInfo(&buf, " LIMIT %lld", (long long)node->limit);
/*
* Notice that we pass NULL for paramTypes, thus forcing the remote server
* to infer types for all parameters. Since we explicitly cast every
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index aaec132..675beb5 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -316,10 +316,10 @@ recompute_limits(LimitState *node)
static void
pass_down_bound(LimitState *node, PlanState *child_node)
{
+ int64 tuples_needed = node->count + node->offset;
if (IsA(child_node, SortState))
{
SortState *sortState = (SortState *) child_node;
- int64 tuples_needed = node->count + node->offset;
/* negative test checks for overflow in sum */
if (node->noCount || tuples_needed < 0)
@@ -341,6 +341,11 @@ pass_down_bound(LimitState *node, PlanState *child_node)
for (i = 0; i < maState->ms_nplans; i++)
pass_down_bound(node, maState->mergeplans[i]);
}
+ else if (IsA(child_node, ForeignScanState))
+ {
+ ForeignScanState *fsState = (ForeignScanState *) child_node;
+ fsState->limit = tuples_needed;
+ }
else if (IsA(child_node, ResultState))
{
/*
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index d33392f..d706d87 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -1502,6 +1502,7 @@ typedef struct ForeignScanState
ScanState ss; /* its first field is NodeTag */
ExprState *fdw_recheck_quals; /* original quals not in ss.ps.qual */
Size pscan_len; /* size of parallel coordination information */
+ int64 limit; /* pushed down limit */
/* use struct pointer to avoid including fdwapi.h here */
struct FdwRoutine *fdwroutine;
void *fdw_state; /* foreign-data wrapper can keep state here */
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers