On 2019-04-03 18:44:32 +1300, David Rowley wrote: > (Fixed all of what you mentioned) > > On Wed, 3 Apr 2019 at 06:57, Andres Freund <and...@anarazel.de> wrote: > > > +/* > > > + * CopyMultiInsertInfo_Flush > > > + * Write out all stored tuples in all buffers out to the > > > tables. > > > + * > > > + * To save us from ending up with buffers for 1000s of partitions we > > > remove > > > + * buffers belonging to partitions that we've seen no tuples for in this > > > batch > > > > That seems a little naive (imagine you have like 5 partitions, and we > > constantly cycle through 2-3 of them per batch). It's probably OK for > > this version. I guess I'd only do that cleanup if we're actually > > flushing due to the number of partitions. > > hmm good point. It seems like being smarter there would be a good thing. > > I've ended up getting rid of the hash table in favour of the List that > you mentioned and storing the buffer in ResultRelInfo.
Cool. > I also changed > the logic that removes buffers once we reach the limit. Instead of > getting rid of buffers that were not used on this run, I've changed it > so it just gets rid of the buffers starting with the oldest one first, > but stops once the number of buffers is at the maximum again. This > can mean that we end up with MAX_BUFFERED_TUPLES buffers instead of > MAX_PARTITION_BUFFERS if there is only 1 tuple per buffer. My current > thinking is that this does not matter since only 1 slot will be > allocated per buffer. We'll remove all of the excess buffers during > the flush and keep just MAX_PARTITION_BUFFERS of the newest buffers. Yea, that makes sense to me. > Also, after changing CopyMultiInsertBuffer to use fixed sized arrays > instead of allocating them with another palloc the performance has > improved a bit more. > > Using the perl files mentioned in [1] > > Master + Patched: > # copy listp from program $$perl ~/bench_same.pl$$ delimiter '|'; > COPY 35651564 > Time: 9106.776 ms (00:09.107) > # truncate table listp; > TRUNCATE TABLE > # copy listp from program $$perl ~/bench.pl$$ delimiter '|'; > COPY 35651564 > Time: 10154.196 ms (00:10.154) > > > Master only: > # copy listp from program $$perl ~/bench_same.pl$$ delimiter '|'; > COPY 35651564 > Time: 22200.535 ms (00:22.201) > # truncate table listp; > TRUNCATE TABLE > # copy listp from program $$perl ~/bench.pl$$ delimiter '|'; > COPY 35651564 > Time: 18592.107 ms (00:18.592) Awesome. I'm probably too tired to give you meaningful feedback tonight. But I'll do a quick readthrough. > +/* Class the buffer full if there are >= this many bytes of tuples stored */ > +#define MAX_BUFFERED_BYTES 65535 Random aside: This seems pretty small (but should be changed separately. > +/* Trim the list of buffers back down to this number after flushing */ > +#define MAX_PARTITION_BUFFERS 32 > + > +/* Stores multi-insert data related to a single relation in CopyFrom. */ > +typedef struct CopyMultiInsertBuffer > +{ > + TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */ > + ResultRelInfo *resultRelInfo; /* ResultRelInfo for 'relid' */ > + BulkInsertState bistate; /* BulkInsertState for this rel */ > + int nused; /* number of 'slots' > containing tuples */ > + uint64 linenos[MAX_BUFFERED_TUPLES]; /* Line # of tuple in > copy > + > * stream */ > +} CopyMultiInsertBuffer; I don't think it's needed now, but you'd probably achieve a bit better locality by storing slots + linenos in a single array of (slot,lineno). > +/* > + * CopyMultiInsertBuffer_Init > + * Allocate memory and initialize a new CopyMultiInsertBuffer for > this > + * ResultRelInfo. > + */ > +static CopyMultiInsertBuffer * > +CopyMultiInsertBuffer_Init(ResultRelInfo *rri) > +{ > + CopyMultiInsertBuffer *buffer; > + > + buffer = (CopyMultiInsertBuffer *) > palloc(sizeof(CopyMultiInsertBuffer)); > + memset(buffer->slots, 0, sizeof(TupleTableSlot *) * > MAX_BUFFERED_TUPLES); Is there a reason to not just directly palloc0? > +/* > + * CopyMultiInsertBuffer_Cleanup > + * Drop used slots and free member for this buffer. The buffer > + * must be flushed before cleanup. > + */ > +static inline void > +CopyMultiInsertBuffer_Cleanup(CopyMultiInsertBuffer *buffer) > +{ > + int i; > + > + /* Ensure buffer was flushed */ > + Assert(buffer->nused == 0); > + > + /* Remove back-link to ourself */ > + buffer->resultRelInfo->ri_CopyMultiInsertBuffer = NULL; > + > + ReleaseBulkInsertStatePin(buffer->bistate); Hm, afaict this still leaks the bistate itself? Greetings, Andres Freund