On Sun, Jul 26, 2020 at 6:06 AM Tom Lane wrote:
> Amit Langote writes:
> > [ v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patch ]
>
> Pushed with cosmetic changes.
Thanks for that.
> I'd always supposed that stdio does enough internal buffering that short
> fread()s shouldn't be mu
Amit Langote writes:
> [ v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patch ]
Pushed with cosmetic changes.
I'd always supposed that stdio does enough internal buffering that short
fread()s shouldn't be much worse than memcpy(). But I reproduced your
result of ~30% speedup for data
> >
> > I will change the status to "ready for committer" in commitfest
> > tomorrow. Hope that's fine.
>
> I agree, a committer can have a look at this.
>
I changed the status in the commit fest to "Ready for Committer".
https://commitfest.postgresql.org/28/2632/
With Regards,
Bharath Rupireddy
On Thu, Jul 16, 2020 at 8:52 PM Bharath Rupireddy
wrote:
>
> On Thu, Jul 16, 2020 at 7:44 PM Amit Langote wrote:
> >
> > On Wed, Jul 15, 2020 at 1:06 PM vignesh C wrote:
> > > On Wed, Jul 15, 2020 at 8:03 AM Amit Langote
> > > wrote:
> > > > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote:
>
On Thu, Jul 16, 2020 at 7:44 PM Amit Langote wrote:
>
> On Wed, Jul 15, 2020 at 1:06 PM vignesh C wrote:
> > On Wed, Jul 15, 2020 at 8:03 AM Amit Langote
> > wrote:
> > > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote:
> > > > Rest of the changes looked fine to me. Also I noticed that
> > >
On Wed, Jul 15, 2020 at 1:06 PM vignesh C wrote:
> On Wed, Jul 15, 2020 at 8:03 AM Amit Langote wrote:
> > On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote:
> > > Rest of the changes looked fine to me. Also I noticed that
> > > commit message was missing in the patch.
>>
> > Please see the attac
On Wed, Jul 15, 2020 at 8:03 AM Amit Langote wrote:
>
> Hi Vignesh,
>
> On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote:
> > On Tue, Jul 14, 2020 at 11:19 AM Amit Langote
> > wrote:
> > > Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
> > > candidate for rigorous code optimiz
Hi Vignesh,
On Tue, Jul 14, 2020 at 10:23 PM vignesh C wrote:
> On Tue, Jul 14, 2020 at 11:19 AM Amit Langote wrote:
> > Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
> > candidate for rigorous code optimization as it does not get called
> > that often.
>
> I thought we could
On Tue, Jul 14, 2020 at 11:19 AM Amit Langote wrote:
>
>
> Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
> candidate for rigorous code optimization as it does not get called
> that often.
>
I thought we could include that change as we are making changes around
that code. Rest o
On Tue, Jul 14, 2020 at 2:02 PM vignesh C wrote:
> On Tue, Jul 14, 2020 at 7:26 AM Amit Langote wrote:
> > In CopyLoadRawBuf(), we could also change the condition if
> > (cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0),
> > which looks clearer.
> >
> > Also, if we are going to
On Tue, Jul 14, 2020 at 7:26 AM Amit Langote wrote:
>
> Good idea, thanks.
>
> In CopyLoadRawBuf(), we could also change the condition if
> (cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0),
> which looks clearer.
>
> Also, if we are going to use the macro more generally, let's m
On Mon, Jul 13, 2020 at 11:58 PM Bharath Rupireddy
wrote:
> > I had one small comment:
> > +{
> > + int copied_bytes = 0;
> > +
> > +#define BUF_BYTES (cstate->raw_buf_len - cstate->raw_buf_index)
> > +#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
> > + do {\
> > +
>
> I had one small comment:
> +{
> + int copied_bytes = 0;
> +
> +#define BUF_BYTES (cstate->raw_buf_len - cstate->raw_buf_index)
> +#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
> + do {\
> + memcpy((dest), (cstate)->raw_buf +
> (cstate)->raw_buf_ind
On Mon, Jul 13, 2020 at 8:02 AM Amit Langote wrote:
> By the way, considering the rebase over cd22d3cdb9b, it seemed to me
> that we needed to update the comments in CopyStateData struct
> definition a bit more. While doing that, I realized
> CopyReadFromRawBuf as a name for the new function migh
>
> > > CopyReadFromRawBuf as a name for the new function might be misleading
> > > as long as we are only using it for binary data. Maybe
> > > CopyReadBinaryData is more appropriate? See attached v4 with these
> > > and a few other cosmetic changes.
> > >
> >
> > CopyReadBinaryData() looks mean
On Mon, Jul 13, 2020 at 12:17 PM Bharath Rupireddy
wrote:
> > CopyReadFromRawBuf as a name for the new function might be misleading
> > as long as we are only using it for binary data. Maybe
> > CopyReadBinaryData is more appropriate? See attached v4 with these
> > and a few other cosmetic chang
>
> > This is due to the recent commit
> > cd22d3cdb9bd9963c694c01a8c0232bbae3ddcfb, in which we restricted the
> > raw_buf and line_buf allocations for binary files. Since we are using
> > raw_buf for this performance improvement feature, now, it's enough to
> > restrict only line_buf for binary f
On Mon, Jul 13, 2020 at 10:19 AM Bharath Rupireddy
wrote:
> > On Mon, Jul 13, 2020 at 4:06 AM Thomas Munro wrote:
> >
> > This error showed up when cfbot tried it:
> >
> > COPY BINARY stud_emp FROM
> > '/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data';
> > +
Thanks Thomas for checking this feature.
> On Mon, Jul 13, 2020 at 4:06 AM Thomas Munro wrote:
>
> This error showed up when cfbot tried it:
>
> COPY BINARY stud_emp FROM
> '/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data';
> +ERROR: could not read from COP
On Mon, Jul 13, 2020 at 1:13 AM vignesh C wrote:
> On Fri, Jul 10, 2020 at 8:51 AM Amit Langote wrote:
> > Here the numbers with the updated patch:
> >
> > HEADpatched (v2)
> > foo5 8.56.1
> > foo10 149.4
> > foo20 25 16.7
> >
>
> Patc
On Fri, Jul 10, 2020 at 8:51 AM Amit Langote wrote:
>
> Hi Bharath,
> Here the numbers with the updated patch:
>
> HEADpatched (v2)
> foo5 8.56.1
> foo10 149.4
> foo20 25 16.7
>
Patch applies cleanly, make check & make check-world pass
Hi Bharath,
On Thu, Jul 9, 2020 at 7:33 PM Bharath Rupireddy
wrote:
> Thanks Amit for buying into the idea. I agree that your patch looks
> clean and simple compared to mine and I'm okay with your patch.
>
> I reviewed and tested your patch, below are few comments:
Thanks for checking it out.
>
>
> Considering these points, I came up with the attached patch with a
> much smaller footprint. As far as I can see, it implements the same
> basic idea as your patch. With it, I was able to see an improvement
> in loading time consistent with your numbers. I measured the time of
> loading 10 m
On Tue, Jul 7, 2020 at 4:28 PM Amit Langote wrote:
> The median times for the COPY FROM commands above, with and without
> the patch, are as follows:
>
> HEADpatched
> foo58.5 6.5
> foo10 14 10
> foo20 25 18
Sorry, I forgot to mention that these t
Hi Bharath,
On Mon, Jun 29, 2020 at 2:21 PM Bharath Rupireddy
wrote:
> For Copy From Binary files, there exists below information for each tuple/row.
> 1. field count(number of columns)
> 2. for every field, field size(column data length)
> 3. field data of field size(actual column data)
>
> Curr
Hi,
Added this to commitfest incase this is useful -
https://commitfest.postgresql.org/28/
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Jun 29, 2020 at 10:50 AM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:
> Hi Hackers,
>
> For Copy Fr
Hi Hackers,
For Copy From Binary files, there exists below information for each
tuple/row.
1. field count(number of columns)
2. for every field, field size(column data length)
3. field data of field size(actual column data)
Currently, all the above data required at each step is read directly from
27 matches
Mail list logo