On 2019-04-03 18:44:32 +1300, David Rowley wrote:
> (Fixed all of what you mentioned)
> 
> On Wed, 3 Apr 2019 at 06:57, Andres Freund <and...@anarazel.de> wrote:
> > > +/*
> > > + * CopyMultiInsertInfo_Flush
> > > + *           Write out all stored tuples in all buffers out to the 
> > > tables.
> > > + *
> > > + * To save us from ending up with buffers for 1000s of partitions we 
> > > remove
> > > + * buffers belonging to partitions that we've seen no tuples for in this 
> > > batch
> >
> > That seems a little naive (imagine you have like 5 partitions, and we
> > constantly cycle through 2-3 of them per batch).  It's probably OK for
> > this version.   I guess I'd only do that cleanup if we're actually
> > flushing due to the number of partitions.
> 
> hmm good point.  It seems like being smarter there would be a good thing.
> 
> I've ended up getting rid of the hash table in favour of the List that
> you mentioned and storing the buffer in ResultRelInfo.


Cool.

> I also changed
> the logic that removes buffers once we reach the limit.  Instead of
> getting rid of buffers that were not used on this run, I've changed it
> so it just gets rid of the buffers starting with the oldest one first,
> but stops once the number of buffers is at the maximum again.  This
> can mean that we end up with MAX_BUFFERED_TUPLES buffers instead of
> MAX_PARTITION_BUFFERS if there is only 1 tuple per buffer.  My current
> thinking is that this does not matter since only 1 slot will be
> allocated per buffer.  We'll remove all of the excess buffers during
> the flush and keep just MAX_PARTITION_BUFFERS of the newest buffers.

Yea, that makes sense to me.


> Also, after changing CopyMultiInsertBuffer to use fixed sized arrays
> instead of allocating them with another palloc the performance has
> improved a bit more.
> 
> Using the perl files mentioned in [1]
> 
> Master + Patched:
> # copy listp from program $$perl ~/bench_same.pl$$ delimiter '|';
> COPY 35651564
> Time: 9106.776 ms (00:09.107)
> # truncate table listp;
> TRUNCATE TABLE
> # copy listp from program $$perl ~/bench.pl$$ delimiter '|';
> COPY 35651564
> Time: 10154.196 ms (00:10.154)
> 
> 
> Master only:
> # copy listp from program $$perl ~/bench_same.pl$$ delimiter '|';
> COPY 35651564
> Time: 22200.535 ms (00:22.201)
> # truncate table listp;
> TRUNCATE TABLE
> # copy listp from program $$perl ~/bench.pl$$ delimiter '|';
> COPY 35651564
> Time: 18592.107 ms (00:18.592)

Awesome. I'm probably too tired to give you meaningful feedback
tonight. But I'll do a quick readthrough.



> +/* Class the buffer full if there are >= this many bytes of tuples stored */
> +#define MAX_BUFFERED_BYTES           65535

Random aside: This seems pretty small (but should be changed separately.


> +/* Trim the list of buffers back down to this number after flushing */
> +#define MAX_PARTITION_BUFFERS        32
> +
> +/* Stores multi-insert data related to a single relation in CopyFrom. */
> +typedef struct CopyMultiInsertBuffer
> +{
> +     TupleTableSlot *slots[MAX_BUFFERED_TUPLES]; /* Array to store tuples */
> +     ResultRelInfo *resultRelInfo;   /* ResultRelInfo for 'relid' */
> +     BulkInsertState bistate;        /* BulkInsertState for this rel */
> +     int                     nused;                  /* number of 'slots' 
> containing tuples */
> +     uint64          linenos[MAX_BUFFERED_TUPLES];   /* Line # of tuple in 
> copy
> +                                                                             
>                  * stream */
> +} CopyMultiInsertBuffer;

I don't think it's needed now, but you'd probably achieve a bit better
locality by storing slots + linenos in a single array of (slot,lineno).

> +/*
> + * CopyMultiInsertBuffer_Init
> + *           Allocate memory and initialize a new CopyMultiInsertBuffer for 
> this
> + *           ResultRelInfo.
> + */
> +static CopyMultiInsertBuffer *
> +CopyMultiInsertBuffer_Init(ResultRelInfo *rri)
> +{
> +     CopyMultiInsertBuffer *buffer;
> +
> +     buffer = (CopyMultiInsertBuffer *) 
> palloc(sizeof(CopyMultiInsertBuffer));
> +     memset(buffer->slots, 0, sizeof(TupleTableSlot *) * 
> MAX_BUFFERED_TUPLES);

Is there a reason to not just directly palloc0?


> +/*
> + * CopyMultiInsertBuffer_Cleanup
> + *           Drop used slots and free member for this buffer.  The buffer
> + *           must be flushed before cleanup.
> + */
> +static inline void
> +CopyMultiInsertBuffer_Cleanup(CopyMultiInsertBuffer *buffer)
> +{
> +     int                     i;
> +
> +     /* Ensure buffer was flushed */
> +     Assert(buffer->nused == 0);
> +
> +     /* Remove back-link to ourself */
> +     buffer->resultRelInfo->ri_CopyMultiInsertBuffer = NULL;
> +
> +     ReleaseBulkInsertStatePin(buffer->bistate);

Hm, afaict this still leaks the bistate itself?


Greetings,

Andres Freund


Reply via email to