Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-26 Thread Amit Langote
On Sun, Jul 26, 2020 at 6:06 AM Tom Lane wrote: > Amit Langote writes: > > [ v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patch ] > > Pushed with cosmetic changes. Thanks for that. > I'd always supposed that stdio does enough internal buffering that short > fread()s shouldn't be mu

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-25 Thread Tom Lane
Amit Langote writes: > [ v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patch ] Pushed with cosmetic changes. I'd always supposed that stdio does enough internal buffering that short fread()s shouldn't be much worse than memcpy(). But I reproduced your result of ~30% speedup for data

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-17 Thread Bharath Rupireddy
> > > > I will change the status to "ready for committer" in commitfest > > tomorrow. Hope that's fine. > > I agree, a committer can have a look at this. > I changed the status in the commit fest to "Ready for Committer". https://commitfest.postgresql.org/28/2632/ With Regards, Bharath Rupireddy

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-16 Thread vignesh C
On Thu, Jul 16, 2020 at 8:52 PM Bharath Rupireddy wrote: > > On Thu, Jul 16, 2020 at 7:44 PM Amit Langote wrote: > > > > On Wed, Jul 15, 2020 at 1:06 PM vignesh C wrote: > > > On Wed, Jul 15, 2020 at 8:03 AM Amit Langote > > > wrote: > > > > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote: >

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-16 Thread Bharath Rupireddy
On Thu, Jul 16, 2020 at 7:44 PM Amit Langote wrote: > > On Wed, Jul 15, 2020 at 1:06 PM vignesh C wrote: > > On Wed, Jul 15, 2020 at 8:03 AM Amit Langote > > wrote: > > > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote: > > > > Rest of the changes looked fine to me. Also I noticed that > > >

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-16 Thread Amit Langote
On Wed, Jul 15, 2020 at 1:06 PM vignesh C wrote: > On Wed, Jul 15, 2020 at 8:03 AM Amit Langote wrote: > > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote: > > > Rest of the changes looked fine to me. Also I noticed that > > > commit message was missing in the patch. >> > > Please see the attac

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-14 Thread vignesh C
On Wed, Jul 15, 2020 at 8:03 AM Amit Langote wrote: > > Hi Vignesh, > > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote: > > On Tue, Jul 14, 2020 at 11:19 AM Amit Langote > > wrote: > > > Sounds fine to me. Although CopyLoadRawBuf() does not seem to a > > > candidate for rigorous code optimiz

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-14 Thread Amit Langote
Hi Vignesh, On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote: > On Tue, Jul 14, 2020 at 11:19 AM Amit Langote wrote: > > Sounds fine to me. Although CopyLoadRawBuf() does not seem to a > > candidate for rigorous code optimization as it does not get called > > that often. > > I thought we could

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-14 Thread vignesh C
On Tue, Jul 14, 2020 at 11:19 AM Amit Langote wrote: > > > Sounds fine to me. Although CopyLoadRawBuf() does not seem to a > candidate for rigorous code optimization as it does not get called > that often. > I thought we could include that change as we are making changes around that code. Rest o

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread Amit Langote
On Tue, Jul 14, 2020 at 2:02 PM vignesh C wrote: > On Tue, Jul 14, 2020 at 7:26 AM Amit Langote wrote: > > In CopyLoadRawBuf(), we could also change the condition if > > (cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0), > > which looks clearer. > > > > Also, if we are going to

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread vignesh C
On Tue, Jul 14, 2020 at 7:26 AM Amit Langote wrote: > > Good idea, thanks. > > In CopyLoadRawBuf(), we could also change the condition if > (cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0), > which looks clearer. > > Also, if we are going to use the macro more generally, let's m

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread Amit Langote
On Mon, Jul 13, 2020 at 11:58 PM Bharath Rupireddy wrote: > > I had one small comment: > > +{ > > + int copied_bytes = 0; > > + > > +#define BUF_BYTES (cstate->raw_buf_len - cstate->raw_buf_index) > > +#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\ > > + do {\ > > +

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread Bharath Rupireddy
> > I had one small comment: > +{ > + int copied_bytes = 0; > + > +#define BUF_BYTES (cstate->raw_buf_len - cstate->raw_buf_index) > +#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\ > + do {\ > + memcpy((dest), (cstate)->raw_buf + > (cstate)->raw_buf_ind

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread vignesh C
On Mon, Jul 13, 2020 at 8:02 AM Amit Langote wrote: > By the way, considering the rebase over cd22d3cdb9b, it seemed to me > that we needed to update the comments in CopyStateData struct > definition a bit more. While doing that, I realized > CopyReadFromRawBuf as a name for the new function migh

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread Bharath Rupireddy
> > > > CopyReadFromRawBuf as a name for the new function might be misleading > > > as long as we are only using it for binary data. Maybe > > > CopyReadBinaryData is more appropriate? See attached v4 with these > > > and a few other cosmetic changes. > > > > > > > CopyReadBinaryData() looks mean

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-13 Thread Amit Langote
On Mon, Jul 13, 2020 at 12:17 PM Bharath Rupireddy wrote: > > CopyReadFromRawBuf as a name for the new function might be misleading > > as long as we are only using it for binary data. Maybe > > CopyReadBinaryData is more appropriate? See attached v4 with these > > and a few other cosmetic chang

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-12 Thread Bharath Rupireddy
> > > This is due to the recent commit > > cd22d3cdb9bd9963c694c01a8c0232bbae3ddcfb, in which we restricted the > > raw_buf and line_buf allocations for binary files. Since we are using > > raw_buf for this performance improvement feature, now, it's enough to > > restrict only line_buf for binary f

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-12 Thread Amit Langote
On Mon, Jul 13, 2020 at 10:19 AM Bharath Rupireddy wrote: > > On Mon, Jul 13, 2020 at 4:06 AM Thomas Munro wrote: > > > > This error showed up when cfbot tried it: > > > > COPY BINARY stud_emp FROM > > '/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data'; > > +

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-12 Thread Bharath Rupireddy
Thanks Thomas for checking this feature. > On Mon, Jul 13, 2020 at 4:06 AM Thomas Munro wrote: > > This error showed up when cfbot tried it: > > COPY BINARY stud_emp FROM > '/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data'; > +ERROR: could not read from COP

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-12 Thread Thomas Munro
On Mon, Jul 13, 2020 at 1:13 AM vignesh C wrote: > On Fri, Jul 10, 2020 at 8:51 AM Amit Langote wrote: > > Here the numbers with the updated patch: > > > > HEADpatched (v2) > > foo5 8.56.1 > > foo10 149.4 > > foo20 25 16.7 > > > > Patc

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-12 Thread vignesh C
On Fri, Jul 10, 2020 at 8:51 AM Amit Langote wrote: > > Hi Bharath, > Here the numbers with the updated patch: > > HEADpatched (v2) > foo5 8.56.1 > foo10 149.4 > foo20 25 16.7 > Patch applies cleanly, make check & make check-world pass

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-09 Thread Amit Langote
Hi Bharath, On Thu, Jul 9, 2020 at 7:33 PM Bharath Rupireddy wrote: > Thanks Amit for buying into the idea. I agree that your patch looks > clean and simple compared to mine and I'm okay with your patch. > > I reviewed and tested your patch, below are few comments: Thanks for checking it out. >

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-09 Thread Bharath Rupireddy
> > Considering these points, I came up with the attached patch with a > much smaller footprint. As far as I can see, it implements the same > basic idea as your patch. With it, I was able to see an improvement > in loading time consistent with your numbers. I measured the time of > loading 10 m

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-07 Thread Amit Langote
On Tue, Jul 7, 2020 at 4:28 PM Amit Langote wrote: > The median times for the COPY FROM commands above, with and without > the patch, are as follows: > > HEADpatched > foo58.5 6.5 > foo10 14 10 > foo20 25 18 Sorry, I forgot to mention that these t

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-07 Thread Amit Langote
Hi Bharath, On Mon, Jun 29, 2020 at 2:21 PM Bharath Rupireddy wrote: > For Copy From Binary files, there exists below information for each tuple/row. > 1. field count(number of columns) > 2. for every field, field size(column data length) > 3. field data of field size(actual column data) > > Curr

Re: [PATCH] Performance Improvement For Copy From Binary Files

2020-07-01 Thread Bharath Rupireddy
Hi, Added this to commitfest incase this is useful - https://commitfest.postgresql.org/28/ With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com On Mon, Jun 29, 2020 at 10:50 AM Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote: > Hi Hackers, > > For Copy Fr