Re: Re: Perform COPY FROM encoding conversions in larger chunks

2021-05-01 Thread Chapman Flack
On 04/01/21 05:27, Heikki Linnakangas wrote: > I read through the patches one more time, made a few small comment fixes, > and pushed. Wow, this whole thread escaped my attention at the time, though my ears would have perked right up if the subject had been something like 'improve encoding convers

Re: Perform COPY FROM encoding conversions in larger chunks

2021-04-01 Thread Heikki Linnakangas
On 01/04/2021 11:09, Heikki Linnakangas wrote: On 18/03/2021 20:05, John Naylor wrote: I wrote: > I went ahead and rebased these. Thanks! I read through the patches one more time, made a few small comment fixes, and pushed. - Heikki

Re: Perform COPY FROM encoding conversions in larger chunks

2021-04-01 Thread Heikki Linnakangas
On 18/03/2021 20:05, John Naylor wrote: I wrote: > I went ahead and rebased these. Thanks! I also wanted to see if this patch set had any performance effect, with and without changing how UTF-8 is validated, using the blackhole am from https://github.com/michaelpq/pg_plugins/tree/master/bl

Re: Perform COPY FROM encoding conversions in larger chunks

2021-03-18 Thread John Naylor
On Thu, Mar 18, 2021 at 2:05 PM John Naylor wrote: > > I wrote: > > > I went ahead and rebased these. > > It looks like FreeBSD doesn't like this for some reason. On closer examination, make check was "terminated", not that the tests failed... -- John Naylor EDB: http://www.enterprisedb.com

Re: Perform COPY FROM encoding conversions in larger chunks

2021-03-18 Thread John Naylor
I wrote: > I went ahead and rebased these. It looks like FreeBSD doesn't like this for some reason. I also wanted to see if this patch set had any performance effect, with and without changing how UTF-8 is validated, using the blackhole am from https://github.com/michaelpq/pg_plugins/tree/master

Re: Perform COPY FROM encoding conversions in larger chunks

2021-02-09 Thread Heikki Linnakangas
On 09/02/2021 15:40, John Naylor wrote: On Sun, Feb 7, 2021 at 2:13 PM Heikki Linnakangas wrote: > > On 02/02/2021 23:42, John Naylor wrote: > > > > In copyfromparse.c, this is now out of date: > > > > * Read the next input line and stash it in line_buf, with co

Re: Perform COPY FROM encoding conversions in larger chunks

2021-02-09 Thread John Naylor
On Sun, Feb 7, 2021 at 2:13 PM Heikki Linnakangas wrote: > > On 02/02/2021 23:42, John Naylor wrote: > > > > In copyfromparse.c, this is now out of date: > > > > * Read the next input line and stash it in line_buf, with conversion to > > * server encoding. This comment for CopyReadLine() is s

Re: Perform COPY FROM encoding conversions in larger chunks

2021-02-02 Thread John Naylor
On Mon, Feb 1, 2021 at 12:15 PM Heikki Linnakangas wrote: > Thanks. I fixed it slightly differently, and also changed LocalToUtf() > to follow the same pattern, even though LocalToUtf() did not have the > same bug. Looks good to me. > I added a bunch of tests for various built-in conversions. N

Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-30 Thread John Naylor
On Thu, Jan 28, 2021 at 7:36 AM Heikki Linnakangas wrote: > > Even more surprising was that the second patch > (0002-Replace-pg_utf8_verifystr-with-a-faster-implementati.patch) > actually made things worse again. I thought it would give a modest gain, > but nope. Hmm, that surprised me too. > Ba

Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-28 Thread Heikki Linnakangas
On 28/01/2021 01:23, John Naylor wrote: Hi Heikki, 0001 through 0003 are straightforward, and I think they can be committed now if you like. 0004 is also pretty straightforward. The check you proposed upthread for pg_upgrade seems like the best solution to make that workable. I'll take a lo

Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-28 Thread Heikki Linnakangas
On 28/01/2021 01:23, John Naylor wrote: Hi Heikki, 0001 through 0003 are straightforward, and I think they can be committed now if you like. Thanks for the review! I did some more rigorous microbenchmarking of patch 1 and 2. I used the attached test script, which calls convert_from() functi

Re: Perform COPY FROM encoding conversions in larger chunks

2021-01-27 Thread John Naylor
Hi Heikki, 0001 through 0003 are straightforward, and I think they can be committed now if you like. 0004 is also pretty straightforward. The check you proposed upthread for pg_upgrade seems like the best solution to make that workable. I'll take a look at 0005 soon. I measured the conversions t

Re: Perform COPY FROM encoding conversions in larger chunks

2020-12-23 Thread John Naylor
On Wed, Dec 23, 2020 at 3:41 AM Heikki Linnakangas wrote: > > I'm not sure it's worth the trouble, though. Custom conversions are very > rare. And I don't think any other object can depend on a conversion, so > you can always drop the conversion before upgrade, and re-create it with > the new fun

Re: Perform COPY FROM encoding conversions in larger chunks

2020-12-22 Thread Heikki Linnakangas
On 22/12/2020 22:01, John Naylor wrote: In 0004, it seems you have some doubts about upgrade compatibility. Is that because user-defined conversions would no longer have the right signature? Exactly. If you have an extension that adds a custom conversion function and does CREATE CONVERSION, t

Re: Perform COPY FROM encoding conversions in larger chunks

2020-12-22 Thread John Naylor
On Wed, Dec 16, 2020 at 8:18 AM Heikki Linnakangas wrote: > > Currently, COPY FROM parses the input one line at a time. Each line is > converted to the database encoding separately, or if the file encoding > matches the database encoding, we just check that the input is valid for > the encoding. I
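
For illustration only, the contrast between converting each line separately (as the quoted text describes) and converting larger raw chunks can be sketched in Python. This is not PostgreSQL code: the function names and the tiny chunk size are hypothetical, and Python's standard codecs module stands in for the server's conversion routines.

```python
import codecs


def decode_per_line(data: bytes, encoding: str) -> list:
    """Split the raw bytes into lines first, then convert each line separately.

    Splitting on b"\\n" before decoding assumes that the byte 0x0A never occurs
    inside a multi-byte character, which holds for UTF-8.
    """
    return [line.decode(encoding) for line in data.split(b"\n")]


def decode_in_chunks(data: bytes, encoding: str, chunk_size: int = 8) -> list:
    """Convert fixed-size raw chunks, then split the converted text into lines.

    A chunk boundary may fall in the middle of a multi-byte character, so an
    incremental decoder is used: it holds back the incomplete bytes and
    finishes the character when the next chunk arrives.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    for start in range(0, len(data), chunk_size):
        parts.append(decoder.decode(data[start:start + chunk_size]))
    # Flush the decoder; this raises if the input ended mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts).split("\n")


if __name__ == "__main__":
    raw = "naïve,1\ncafé,2\nüber,3".encode("utf-8")
    print(decode_per_line(raw, "utf-8"))
    print(decode_in_chunks(raw, "utf-8", chunk_size=3))
```

Both functions return the same lines; the chunked variant calls the converter once per chunk instead of once per line, and it has to cope with characters that straddle a chunk boundary.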

Re: Perform COPY FROM encoding conversions in larger chunks

2020-12-17 Thread Bruce Momjian
On Wed, Dec 16, 2020 at 02:17:58PM +0200, Heikki Linnakangas wrote: > I've been looking at the COPY FROM parsing code, trying to refactor it so > that the parallel COPY would be easier to implement. I haven't touched > parallelism itself, just looking for ways to smoothen the way. And for ways > to