Hi,

In <10025bac-158c-ffe7-fbec-32b426291...@dunslane.net>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Wed, 24 Jan 2024 07:15:55 -0500,
  Andrew Dunstan <and...@dunslane.net> wrote:

>
> On 2024-01-24 We 03:11, Michael Paquier wrote:
>> On Wed, Jan 24, 2024 at 02:49:36PM +0900, Sutou Kouhei wrote:
>>> For COPY TO:
>>>
>>> 0001: This adds CopyToRoutine and use it for text/csv/binary
>>> formats. No implementation change. This just move codes.
>> 10M without this change:
>>
>> format,elapsed time (ms)
>> text,1090.763
>> csv,1136.103
>> binary,1137.141
>>
>> 10M with this change:
>>
>> format,elapsed time (ms)
>> text,1082.654
>> csv,1196.991
>> binary,1069.697
>>
>> These numbers point out that binary is faster by 6%, csv is slower by
>> 5%, while text stays around what looks like noise range. That's not
>> negligible. Are these numbers reproducible? If they are, that could
>> be a problem for anybody doing bulk-loading of large data sets. I am
>> not sure to understand where the improvement for binary comes from by
>> reading the patch, but perhaps perf would tell more for each format?
>> The loss with csv could be blamed on the extra manipulations of the
>> function pointers, likely.
>
>
> I don't think that's at all acceptable.
>
> We've spent quite a lot of blood sweat and tears over the years to make COPY
> fast, and we should not sacrifice any of that lightly.

These numbers aren't reproducible, because I ran these
benchmarks on my everyday machine, not on a machine
dedicated to benchmarking. The machine also runs other
processes such as an editor and a Web browser.

For example, here are some results with master
(94edfe250c6a200d2067b0debfe00b4122e9b11e):

Format,N records,Elapsed time (ms)
csv,10000000,1073.715
csv,10000000,1022.830
csv,10000000,1073.584
csv,10000000,1090.651
csv,10000000,1052.259

Here are some results with master + the 0001 patch:

Format,N records,Elapsed time (ms)
csv,10000000,1025.356
csv,10000000,1067.202
csv,10000000,1014.563
csv,10000000,1032.088
csv,10000000,1058.110

I uploaded my benchmark script so that you can run the same
benchmark on your machine:

https://gist.github.com/kou/be02e02e5072c91969469dbf137b5de5

Could anyone try the benchmark with master and master+0001?

Thanks,
--
kou