On 18/03/2021 20:05, John Naylor wrote:
I wrote:

 > I went ahead and rebased these.

Thanks!

I also wanted to see if this patch set had any performance effect, with and without changing how UTF-8 is validated, using the blackhole AM from https://github.com/michaelpq/pg_plugins/tree/master/blackhole_am.

create extension blackhole_am;
create table blackhole_tab (a text) using blackhole_am ;
time ./inst/bin/psql -c "copy blackhole_tab from '/path/to/test-copy.txt'"

...where test-copy.txt is made by

for i in {1..100}; do cat UTF-8-Sampler.htm >> test-copy.txt ;  done;

On Linux x86-64, gcc 8.4, I get these numbers (minimum of five runs):

master:
109ms

v6 do encoding in larger chunks:
109ms

v7 utf8 SIMD:
98ms

That's disappointing. Perhaps the file size is just too small to see the effect? I'm seeing results between 40 ms and 75 ms on my laptop when I run a test like that multiple times. I used "WHERE false" instead of the blackhole AM but I don't think that makes much difference (only showing a few runs here for brevity):

for i in {1..100}; do cat /tmp/utf8.html >> /tmp/test-copy.txt ;  done;

postgres=# create table blackhole_tab (a text) ;
CREATE TABLE
postgres=# \timing
Timing is on.
postgres=# copy blackhole_tab  from '/tmp/test-copy.txt' where false;
COPY 0
Time: 53.166 ms
postgres=# copy blackhole_tab  from '/tmp/test-copy.txt' where false;
COPY 0
Time: 43.981 ms
postgres=# copy blackhole_tab  from '/tmp/test-copy.txt' where false;
COPY 0
Time: 71.850 ms
postgres=# copy blackhole_tab  from '/tmp/test-copy.txt' where false;
COPY 0
...
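
Given the run-to-run spread in timings like the ones above, taking the minimum of several runs is less noisy than eyeballing individual results. A minimal sketch of such a harness follows; the function name min_of_runs is hypothetical, it assumes GNU date (for %N nanoseconds, as on Linux), and when used with psql it measures client startup as well as the COPY itself, so it is not equivalent to \timing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command N times and print the fastest
# wall-clock time in milliseconds. Assumes GNU date (%N = nanoseconds).
min_of_runs() {
    local runs=$1; shift
    local best="" start end elapsed i
    for ((i = 0; i < runs; i++)); do
        start=$(date +%s%N)
        "$@" > /dev/null          # discard the command's own output
        end=$(date +%s%N)
        elapsed=$(( (end - start) / 1000000 ))
        # Keep the smallest elapsed time seen so far.
        if [[ -z $best || $elapsed -lt $best ]]; then
            best=$elapsed
        fi
    done
    echo "$best"
}

# Example, using the table and file from the session above:
# min_of_runs 5 psql -c "copy blackhole_tab from '/tmp/test-copy.txt' where false"
```

The minimum is used rather than the mean because interference from other processes only ever makes a run slower, never faster.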

I tested that with a larger file:

for i in {1..10000}; do cat /tmp/utf8.html >> /tmp/test-copy.txt ;  done;
postgres=# copy blackhole_tab  from '/tmp/test-copy.txt' where false;

v6 do encoding in larger chunks (best of five):
Time: 3955.514 ms (00:03.956)

master (best of five):
Time: 4133.767 ms (00:04.134)

So with that, I'm seeing a measurable difference.
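
To put the figures quoted in this thread in relative terms, the reduction in runtime against master can be computed as below. This is only arithmetic on the numbers already given; the helper name pct_faster is hypothetical.

```shell
# Hypothetical helper: percentage reduction in runtime relative to a baseline.
pct_faster() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

pct_faster 4133.767 3955.514   # larger file, master vs. v6: prints 4.3
pct_faster 109 98              # smaller file, master vs. v7: prints 10.1
```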

- Heikki

