Hello, Sorry for my late response. The attached patch reflects your comments.
> Here are few comments on latest patch: > > > 1. > make/make check is fine, however I am getting regression failure in > postgres_fdw contrib module (attached regression.diff). > Please investigate and fix. > It was an incorrect interaction when postgres_fdw tries to push down sorting to the remote side. We cannot attach LIMIT clause on the plain scan path across SORT, however, the previous version estimated the cost for the plain scan with LIMIT clause even if local sorting is needed. If remote scan may return just 10 rows, estimated cost of the local sort is very lightweight, thus, this unreasonable path was chosen. (On the other hands, its query execution results were correct because ps_numTuples is not delivered across Sort node, so ForeignScan eventually scanned all the remote tuples. It made correct results but not optimal from the viewpoint of performance.) The v3 patch estimates the cost with remote LIMIT clause only if supplied pathkey strictly matches with the final output order of the query, thus, no local sorting is expected. Some of the regression test cases still have different plans but due to the new optimization by remote LIMIT clause. Without remote LIMIT clause, some of regression test cases preferred remote-JOIN + local-SORT then local-LIMIT. Once we have remote-LIMIT option, it allows to discount the cost for remote-SORT by choice of top-k heap sorting. It changed the optimizer's decision on some test cases. Potential one big change is the test case below. -- CROSS JOIN, not pushed down EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.c1, t2.c1 FROM ft1 t1 CROSS JOIN ft2 t2 ORDER BY t1.c1, t2.c1 OFFSET 100 LIMIT 10; It assumed CROSS JOIN was not pushed down due to the cost for network traffic, however, remote LIMIT reduced the estimated number of tuples to be moved. So, all of the CROSS JOIN + ORDER BY + LIMIT became to run on the remote side. > 2. > + * > + * MEMO: root->limit_tuples is not attached when query > contains > + * grouping-clause or aggregate functions. So, we don's adjust > + * rows even if LIMIT <const> is supplied. > > Can you please explain why you are not doing this for grouping-clause or > aggregate functions. > See grouping_planner() at optimizer/plan/planner.c It puts an invalid value on the root->limit_tuples if query has GROUP BY clause, so we cannot know number of tuples to be returned even if we have upper limit actually. /* * Figure out whether there's a hard limit on the number of rows that * query_planner's result subplan needs to return. Even if we know a * hard limit overall, it doesn't apply if the query has any * grouping/aggregation operations, or SRFs in the tlist. */ if (parse->groupClause || parse->groupingSets || parse->distinctClause || parse->hasAggs || parse->hasWindowFuncs || parse->hasTargetSRFs || root->hasHavingQual) root->limit_tuples = -1.0; else root->limit_tuples = limit_tuples; > 3. > Typo: > > don's => don't > Fixed, best regards, ---- PG-Strom Project / NEC OSS Promotion Center KaiGai Kohei <kai...@ak.jp.nec.com>
passdown-limit-fdw.v3.patch
Description: passdown-limit-fdw.v3.patch
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers