Hi,

On 2023-07-27 20:53:16 +1200, David Rowley wrote:
> To summarise, REL_15_STABLE can run this benchmark in 526.014 ms on my
> AMD 3990x machine. Today's REL_16_STABLE takes 530.344 ms. We're
> talking about another patch to speed up the pg_strtoint functions
> which gets this down to 483.790 ms. Do we need to do this for v16 or
> can we just leave this as it is already? The slowdown does not seem
> to be much above what we'd ordinarily classify as noise using this
> test on my machine.

I think we need to do something for 16 - on recent-ish AMD the regression
appears to be quite a bit smaller than on Intel. You see something < 1%, I
see more like 4%. I think there are also other cases where the slowdown is
more substantial.

Besides Intel vs AMD, it also looks like the gcc version might make a
difference. The code generated by 13 is noticeably slower than 12 for me...

> Benchmark setup:
>
> COPY (SELECT generate_series(1, 2000000) a, (random() * 100000 -
> 50000)::int b, 3243423 c) TO '/tmp/lotsaints.copy';
> DROP TABLE lotsaints; CREATE UNLOGGED TABLE lotsaints(a int, b int, c int);

There are a lot of larger numbers in the file, which likely reduces the
impact somewhat. And there's the overhead of actually inserting the rows into
the table, making the difference appear smaller than it is.

If I avoid the actual insert into the table and use more columns, I see a
regression of about 10% here.

COPY (SELECT generate_series(1, 1000) a, 10 b, 20 c, 30 d, 40 e, 50 f FROM generate_series(1, 10000)) TO '/tmp/lotsaints_wide.copy';

psql -c 'DROP TABLE IF EXISTS lotsaints_wide; CREATE UNLOGGED TABLE lotsaints_wide(a int, b int, c int, d int, e int, f int);' && \
  pgbench -n -P1 -f <( echo "COPY lotsaints_wide FROM '/tmp/lotsaints_wide.copy' WHERE false") -t 5

15:              2992.605
HEAD:            3325.201
fastpath1.patch  2932.606
fastpath2.patch  2783.915

Interestingly fastpath1 is slower now, even though it wasn't with earlier
patches (which is still repeatable). I do not have the foggiest as to why.

Greetings,

Andres Freund