On Mon, 10 Jan 2011 19:26:11 -0500 Tom Lane <t...@sss.pgh.pa.us> wrote: > Shigeru HANADA <han...@metrosystems.co.jp> writes: > > For the purpose of file_fdw, additional ResetCopyFrom() would be > > necessary. I'm planning to include such changes in file_fdw patch. > > Please find attached partial patch for ResetCopyFrom(). Is there > > anything else which should be done at reset? > > Seems like it would be smarter to close and re-open the copy operation. > Adding a reset function is just creating an additional maintenance > burden and point of failure, for what seems likely to be a negligible > performance benefit.
Agreed. fileReScan can be implemented with close/re-open with storing some additional information into FDW private area. I would withdraw the proposal. > If you think it's not negligible, please show some proof of that before > asking us to support such code. Anyway, I've measured overhead of re-open with executing query including inner join between foreign tables copied from pgbench schema. I used SELECT statement below: EXPLAIN (ANALYZE) SELECT count(*) FROM csv_accounts a JOIN csv_branches b ON (b.bid = a.bid); On the average of (Nested Loop - (Foreign Scan * 2)), overhead of re-open is round 0.048ms per tuple (average of 3 times measurement). After the implementation of file_fdw, I'm going to measure again. If ResetCopyFrom significantly improves performance of ReScan, I'll propose it as a separate patch. ========================================================================= The results of EXPLAIN ANALYZE are: [using ResetCopyFrom] QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=11717.02..11717.03 rows=1 width=0) (actual time=73357.655..73357.657 rows=1 loops=1) -> Nested Loop (cost=0.00..11717.01 rows=1 width=0) (actual time=0.209..71424.059 rows=1000000 loops=1) -> Foreign Scan on public.csv_accounts a (cost=0.00..11717.00 rows=1 width=4) (actual time=0.144..6998.497 rows=1000000 loops=1) -> Foreign Scan on public.csv_branches b (cost=0.00..0.00 rows=1 width=4) (actual time=0.008..0.037 rows=10 loops=1000000) Total runtime: 73358.135 ms (11 rows) [using EndCopyFrom + BeginCopyFrom] QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=11717.02..11717.03 rows=1 width=0) (actual time=120724.138..120724.140 rows=1 loops=1) -> Nested Loop (cost=0.00..11717.01 rows=1 width=0) (actual time=0.321..118583.681 rows=1000000 loops=1) -> Foreign Scan on public.csv_accounts a (cost=0.00..11717.00 rows=1 width=4) (actual time=0.156..7208.968 rows=1000000 loops=1) -> Foreign Scan on public.csv_branches b (cost=0.00..0.00 rows=1 width=4) (actual time=0.016..0.046 rows=10 loops=1000000) Total runtime: 121118.792 ms (11 rows) Time: 121122.205 ms ========================================================================= Regards, -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers