Yes, I did not consider that to be a problem because I did not think it
would be used on indexed tables. I figured that the gain from doing bulk
inserts into the table would be so diluted by the still-bottle-necked
index maintenance that it was OK not to use this optimization for
indexed tables.
I've tested with indexes, and the index update time is much larger than
the insert time. Bulk inserts still provide a small gain though, and
having a solution that works in all cases is better IMHO.
My original thought was based on the idea of still using heap_insert, but
with a modified form of bistate which would hold the exclusive lock and
not just a pin. If heap_insert is being driven by the unmodified COPY
code, then it can't guarantee that COPY won't stall on a pipe read or
something, and so probably shouldn't hold an exclusive lock while filling
the block.
Exactly, that's what I was thinking too, and reached the same conclusion.
That is why I decided a local buffer would be better, as the exclusive
lock is really a no-op and wouldn't block anyone. But if you are creating
a new heap_bulk_insert and modifying the COPY to go with it, then you can
guarantee it won't stall from the driving end, instead.
I think it's better, but you have to buffer tuples: at least a full
page's worth, or better, several pages' worth, in case inline
compression kicks in and shrinks them, since the purpose is to be able to
fill a complete page in one go.
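
To make the buffering idea concrete, here is a rough standalone sketch.
It is not PostgreSQL code and not the patch: BulkBuffer, bulk_add and
bulk_flush are hypothetical names, the size constants are illustrative
assumptions, and only tuple lengths are tracked. Tuples accumulate until
several pages' worth are queued, then each page is packed in one pass,
which is the point where real code would take the buffer lock, copy the
tuples and write WAL once per page.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants, assumed for this sketch only. */
#define SKETCH_BLCKSZ     8192   /* assumed block size */
#define SKETCH_PAGE_HDR   24     /* assumed page header size */
#define SKETCH_ITEM_OVHD  4      /* assumed per-tuple line pointer */
#define SKETCH_CAPACITY   (SKETCH_BLCKSZ - SKETCH_PAGE_HDR)
#define SKETCH_MAX_TUPLES 1024

/* Hypothetical buffer: tracks only tuple lengths, not tuple data. */
typedef struct BulkBuffer {
    size_t sizes[SKETCH_MAX_TUPLES];
    size_t count;            /* tuples currently buffered */
    size_t bytes;            /* page space they would consume */
    size_t flush_threshold;  /* flush once this many bytes are queued */
    size_t pages_written;
} BulkBuffer;

static size_t tuple_cost(size_t len) { return len + SKETCH_ITEM_OVHD; }

static void bulk_init(BulkBuffer *b, size_t flush_threshold)
{
    b->count = 0;
    b->bytes = 0;
    b->flush_threshold = flush_threshold;
    b->pages_written = 0;
}

/*
 * Pack buffered tuples into pages. Unless 'final' is set, a tail that
 * does not fill a page stays buffered, so every page emitted here was
 * filled completely in one pass. Returns the number of pages emitted.
 */
static size_t bulk_flush(BulkBuffer *b, int final)
{
    size_t start = 0, pages = 0;

    while (start < b->count) {
        size_t used = 0, i = start;

        while (i < b->count &&
               used + tuple_cost(b->sizes[i]) <= SKETCH_CAPACITY) {
            used += tuple_cost(b->sizes[i]);
            i++;
        }
        if (i == b->count && !final)
            break;          /* partial page: wait for more tuples */
        assert(i > start);  /* bulk_add rejects oversized tuples */
        /* Real code would lock the page, copy tuples start..i-1,
         * emit one WAL record and unlock here, with no I/O waits. */
        pages++;
        start = i;
    }

    /* Slide the unwritten tail to the front of the buffer. */
    b->bytes = 0;
    for (size_t k = 0; start + k < b->count; k++) {
        b->sizes[k] = b->sizes[start + k];
        b->bytes += tuple_cost(b->sizes[k]);
    }
    b->count -= start;
    b->pages_written += pages;
    return pages;
}

/* Queue one tuple; returns -1 if it could never fit on a page. */
static int bulk_add(BulkBuffer *b, size_t len)
{
    if (tuple_cost(len) > SKETCH_CAPACITY)
        return -1;
    if (b->count == SKETCH_MAX_TUPLES) {
        bulk_flush(b, 0);
        if (b->count == SKETCH_MAX_TUPLES)
            bulk_flush(b, 1);   /* many tiny tuples: force them out */
    }
    b->sizes[b->count++] = len;
    b->bytes += tuple_cost(len);
    if (b->bytes >= b->flush_threshold)
        bulk_flush(b, 0);
    return 0;
}
```

A caller would do bulk_init(&b, 4 * SKETCH_CAPACITY), call bulk_add for
each incoming tuple, and finish with bulk_flush(&b, 1) to write the last
partial page. Since the driver only reads input between flushes, a stall
on the input side never happens while a page is being filled.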
Whether any of these approaches will be maintainable enough to be
integrated into the code base is another matter. It seems like there is
already a lot of discussion going on around various permutations of copy
options.
It's not really a COPY mod, since it would also be good for big INSERT
INTO ... SELECT FROM, which is WAL-bound too (even more so than COPY,
since there is no parsing to do).
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)