On Fri, 7 Jan 2011 10:57:17 +0900
Itagaki Takahiro <itagaki.takah...@gmail.com> wrote:
> On Mon, Dec 20, 2010 at 20:42, Itagaki Takahiro
> <itagaki.takah...@gmail.com> wrote:
> > I added comments and moved some setup codes for COPY TO to BeginCopyTo()
> > for maintainability. CopyTo() still contains parts of initialization,
> > but I've not touched it yet because we don't need the arrangement for now.
>
> I updated the COPY FROM API patch.
> - GetCopyExecutorState() is removed because FDWs will use their own context.

I rebased the file_fdw patch onto the recent copy_export patch, and have
some comments.

> The patch just rearranges codes for COPY FROM to export those functions.
> It also modifies some of COPY TO codes internally for code readability.
> - BeginCopyFrom(rel, filename, attnamelist, options)
> - EndCopyFrom(cstate)
> - NextCopyFrom(cstate, OUT values, OUT nulls, OUT tupleOid)
> - CopyFromErrorCallback(arg)

This API set seems sufficient to implement file_fdw on top of the COPY
routines; a rough sketch of the scan callback is below.
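To illustrate how an FDW could drive these routines per tuple, here is a
hypothetical sketch, not the actual file_fdw patch: FileFdwExecutionState
and its cstate field are assumed names, and the NextCopyFrom() signature
follows the list quoted above.

    /*
     * Hypothetical per-tuple scan callback built on the proposed API.
     * FileFdwExecutionState is an illustrative struct holding the
     * CopyState created by BeginCopyFrom() at scan start.
     */
    static TupleTableSlot *
    fileIterateForeignScan(ForeignScanState *node)
    {
        FileFdwExecutionState *festate = (FileFdwExecutionState *) node->fdw_state;
        TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
        ErrorContextCallback errcontext;
        bool        found;
        Oid         tupleOid;

        /* Let COPY report the file name and line number on error. */
        errcontext.callback = CopyFromErrorCallback;
        errcontext.arg = (void *) festate->cstate;
        errcontext.previous = error_context_stack;
        error_context_stack = &errcontext;

        /*
         * NextCopyFrom() fills the slot's values/nulls arrays directly;
         * storing a virtual tuple avoids forming a HeapTuple per row.
         */
        ExecClearTuple(slot);
        found = NextCopyFrom(festate->cstate,
                             slot->tts_values, slot->tts_isnull, &tupleOid);
        if (found)
            ExecStoreVirtualTuple(slot);

        error_context_stack = errcontext.previous;
        return slot;            /* an empty slot signals end of file */
    }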
However, EndCopyFrom() does not seem to be able to release the memory
allocated in BeginCopy() and BeginCopyFrom().  I found this by executing
a query that produces a nested loop plan (1,000,000 outer rows * 10 inner
rows); by the end of the query, the postgres backend had grown from 108MB
to more than 300MB (VIRT in top).

The attached patch avoids this leak by adding a per-copy memory context
to CopyState, roughly as sketched below.  This might be overkill, and
providing a ResetCopyFrom() instead could be reasonable.  In any case, I
couldn't measure any performance degradation with the patch (tested on my
Linux box); results follow the sketch.
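A minimal sketch of the per-copy-context idea follows, assuming the
attached patch takes roughly this shape; the simplified BeginCopy()
signature and the field name copycontext are illustrative.  All setup
allocations hang off a context owned by the CopyState, so a single
MemoryContextDelete() reclaims everything.

    /* Sketch only: the attached patch may differ in detail. */
    typedef struct CopyStateData
    {
        MemoryContext copycontext;  /* owns all per-copy allocations */
        /* ... existing CopyStateData fields ... */
    } CopyStateData;

    static CopyState
    BeginCopy(bool is_from, Relation rel, List *attnamelist, List *options)
    {
        MemoryContext copycontext;
        MemoryContext oldcontext;
        CopyState   cstate;

        /* Create a context to hold everything the CopyState allocates. */
        copycontext = AllocSetContextCreate(CurrentMemoryContext,
                                            "COPY",
                                            ALLOCSET_DEFAULT_MINSIZE,
                                            ALLOCSET_DEFAULT_INITSIZE,
                                            ALLOCSET_DEFAULT_MAXSIZE);
        oldcontext = MemoryContextSwitchTo(copycontext);

        cstate = (CopyState) palloc0(sizeof(CopyStateData));
        cstate->copycontext = copycontext;

        /* ... all remaining setup allocates in copycontext ... */

        MemoryContextSwitchTo(oldcontext);
        return cstate;
    }

    void
    EndCopyFrom(CopyState cstate)
    {
        /* Frees the CopyState and everything allocated during Begin*(). */
        MemoryContextDelete(cstate->copycontext);
    }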
==============
# csv_accounts and csv_branches were generated by:
1) pgbench -i -s 10
2) COPY pgbench_accounts TO '/path/to/accounts.csv' WITH CSV;
3) COPY pgbench_branches TO '/path/to/branches.csv' WITH CSV;

<Original (no memory swap occurred during measurement)>
postgres=# explain analyze select * from csv_accounts b, csv_branches t where t.bid = b.bid;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..11717.01 rows=1 width=200) (actual time=0.300..100833.057 rows=1000000 loops=1)
   Join Filter: (b.bid = t.bid)
   ->  Foreign Scan on csv_accounts b  (cost=0.00..11717.00 rows=1 width=100) (actual time=0.148..4437.595 rows=1000000 loops=1)
   ->  Foreign Scan on csv_branches t  (cost=0.00..0.00 rows=1 width=100) (actual time=0.014..0.039 rows=10 loops=1000000)
 Total runtime: 102882.308 ms
(5 rows)

<Patched (per-copy context used to release memory)>
postgres=# explain analyze select * from csv_accounts b, csv_branches t where t.bid = b.bid;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..11717.01 rows=1 width=200) (actual time=0.226..100931.864 rows=1000000 loops=1)
   Join Filter: (b.bid = t.bid)
   ->  Foreign Scan on csv_accounts b  (cost=0.00..11717.00 rows=1 width=100) (actual time=0.085..4439.777 rows=1000000 loops=1)
   ->  Foreign Scan on csv_branches t  (cost=0.00..0.00 rows=1 width=100) (actual time=0.015..0.039 rows=10 loops=1000000)
 Total runtime: 102684.276 ms
(5 rows)
==============

This leak is not a problem when COPY is used as a plain command, because
a query then handles only one CopyState, and its memory is cleaned up
along with the parent context.

> Some items to be considered:
> - BeginCopyFrom() could receive filename as an option instead of a separated
>   argument. If do so, file_fdw would be more simple, but it's a change only for
>   file_fdw. COPY commands in the core won't be improved at all. ISTM that
>   current design would be better.
> - NextCopyFrom() returns values/nulls arrays rather than a HeapTuple. I expect
>   the caller store the result into tupletableslot with ExecStoreVirtualTuple().
>   It is designed for performance, but if the caller always needs an materialized
>   HeapTuple, HeapTuple is better for the result type.

I tried adding tableoid to TupleTableSlot as tts_tableoid, but that makes
code such as slot_getattr() and other related routines tricky.  How about
implementing this with materialized tuples for now, to avoid changes that
are unnecessary (at least for functionality)?  I would like to defer the
virtual-tuple optimization to the next development cycle, since dropping
it would not affect the interface heavily; the materialized path would
look roughly like the sketch below.
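A hedged sketch of what the materialized-tuple variant could look like
inside the scan callback, under the same assumptions as the earlier
sketch (festate, tupdesc, and slot come from the surrounding FDW state;
this is not the actual patch): each row is formed into a HeapTuple so
the tuple header can carry tableoid, at the cost of one
heap_form_tuple() per row.

    /*
     * Sketch: fetch one row and store it as a materialized tuple.
     * Assumes festate->cstate and festate->rel were set up when the
     * scan began; names are illustrative.
     */
    Datum      *values = slot->tts_values;
    bool       *nulls = slot->tts_isnull;
    Oid         tupleOid = InvalidOid;
    HeapTuple   tuple;

    if (NextCopyFrom(festate->cstate, values, nulls, &tupleOid))
    {
        tuple = heap_form_tuple(tupdesc, values, nulls);

        /* tableoid lives in the tuple header, not in the slot */
        tuple->t_tableOid = RelationGetRelid(festate->rel);
        if (OidIsValid(tupleOid))
            HeapTupleSetOid(tuple, tupleOid);   /* for WITH OIDS tables */

        /* shouldFree = true: the tuple is palloc'd and owned by the slot */
        ExecStoreTuple(tuple, slot, InvalidBuffer, true);
    }
    else
        ExecClearTuple(slot);   /* end of file */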
I'll post a materialized-tuple version of the foreign_scan patch soon.

Regards,
--
Shigeru Hanada

20110113-copy_context.patch