On Mon, Jul 15, 2024 at 4:52 PM Stepan Neretin <sncf...@gmail.com> wrote:
> > > On Mon, Jul 15, 2024 at 12:23 PM Антуан Виолин <violin.ant...@gmail.com> > wrote: > >> Hello all. >>> >>> I have decided to explore more areas in which I can optimize and have >>> added >>> two new benchmarks. Do you have any thoughts on this? >>> >>> postgres=# select bench_int16_sort(1000000); >>> bench_int16_sort >>> >>> >>> ----------------------------------------------------------------------------------------------------------------- >>> Time taken by usual sort: 66354981 ns, Time taken by optimized sort: >>> 52151523 ns, Percentage difference: 21.41% >>> (1 row) >>> >>> postgres=# select bench_float8_sort(1000000); >>> bench_float8_sort >>> >>> >>> ------------------------------------------------------------------------------------------------------------------ >>> Time taken by usual sort: 121475231 ns, Time taken by optimized sort: >>> 74458545 ns, Percentage difference: 38.70% >>> (1 row) >>> >> >> Hello all >> We would like to see the relationship between the length of the sorted >> array and the performance gain, perhaps some graphs. We also want to see >> to set a "worst case" test, sorting the array in ascending order when it >> is initially descending >> >> Best, regards, Antoine Violin >> >> postgres=# >>> >> >> On Mon, Jul 15, 2024 at 10:32 AM Stepan Neretin <sncf...@gmail.com> >> wrote: >> >>> >>> >>> On Sat, Jun 8, 2024 at 1:50 AM Stepan Neretin <sncf...@gmail.com> wrote: >>> >>>> Hello all. >>>> >>>> I am interested in the proposed patch and would like to propose some >>>> additional changes that would complement it. My changes would introduce >>>> similar optimizations when working with a list of integers or object >>>> identifiers. Additionally, my patch includes an extension for >>>> benchmarking, which shows an average speedup of 30-40%. >>>> >>>> postgres=# SELECT bench_oid_sort(1000000); >>>> bench_oid_sort >>>> >>>> >>>> ---------------------------------------------------------------------------------------------------------------- >>>> Time taken by list_sort: 116990848 ns, Time taken by list_oid_sort: >>>> 80446640 ns, Percentage difference: 31.24% >>>> (1 row) >>>> >>>> postgres=# SELECT bench_int_sort(1000000); >>>> bench_int_sort >>>> >>>> >>>> ---------------------------------------------------------------------------------------------------------------- >>>> Time taken by list_sort: 118168506 ns, Time taken by list_int_sort: >>>> 80523373 ns, Percentage difference: 31.86% >>>> (1 row) >>>> >>>> What do you think about these changes? >>>> >>>> Best regards, Stepan Neretin. >>>> >>>> On Fri, Jun 7, 2024 at 11:08 PM Andrey M. Borodin <x4...@yandex-team.ru> >>>> wrote: >>>> >>>>> Hi! >>>>> >>>>> In a thread about sorting comparators[0] Andres noted that we have >>>>> infrastructure to help compiler optimize sorting. PFA attached PoC >>>>> implementation. I've checked that it indeed works on the benchmark from >>>>> that thread. >>>>> >>>>> postgres=# CREATE TABLE arrays_to_sort AS >>>>> SELECT array_shuffle(a) arr >>>>> FROM >>>>> (SELECT ARRAY(SELECT generate_series(1, 1000000)) a), >>>>> generate_series(1, 10); >>>>> >>>>> postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- original >>>>> Time: 990.199 ms >>>>> postgres=# SELECT (sort(arr))[1] FROM arrays_to_sort; -- patched >>>>> Time: 696.156 ms >>>>> >>>>> The benefit seems to be on the order of magnitude with 30% speedup. >>>>> >>>>> There's plenty of sorting by TransactionId, BlockNumber, OffsetNumber, >>>>> Oid etc. But this sorting routines never show up in perf top or something >>>>> like that. >>>>> >>>>> Seems like in most cases we do not spend much time in sorting. But >>>>> specialization does not cost us much too, only some CPU cycles of a >>>>> compiler. I think we can further improve speedup by converting inline >>>>> comparator to value extractor: more compilers will see what is actually >>>>> going on. But I have no proofs for this reasoning. >>>>> >>>>> What do you think? >>>>> >>>>> >>>>> Best regards, Andrey Borodin. >>>>> >>>>> [0] >>>>> https://www.postgresql.org/message-id/flat/20240209184014.sobshkcsfjix6u4r%40awork3.anarazel.de#fc23df2cf314bef35095b632380b4a59 >>>> >>>> >>> Hello all. >>> >>> I have decided to explore more areas in which I can optimize and have >>> added two new benchmarks. Do you have any thoughts on this? >>> >>> postgres=# select bench_int16_sort(1000000); >>> bench_int16_sort >>> >>> >>> ----------------------------------------------------------------------------------------------------------------- >>> Time taken by usual sort: 66354981 ns, Time taken by optimized sort: >>> 52151523 ns, Percentage difference: 21.41% >>> (1 row) >>> >>> postgres=# select bench_float8_sort(1000000); >>> bench_float8_sort >>> >>> >>> ------------------------------------------------------------------------------------------------------------------ >>> Time taken by usual sort: 121475231 ns, Time taken by optimized sort: >>> 74458545 ns, Percentage difference: 38.70% >>> (1 row) >>> >>> postgres=# >>> >>> Best regards, Stepan Neretin. >>> >> > > I run benchmark with my patches: > ./pgbench -c 10 -j2 -t1000 -d postgres > > pgbench (18devel) > starting vacuum...end. > transaction type: <builtin: TPC-B (sort of)> > scaling factor: 10 > query mode: simple > number of clients: 10 > number of threads: 2 > maximum number of tries: 1 > number of transactions per client: 1000 > number of transactions actually processed: 10000/10000 > number of failed transactions: 0 (0.000%) > latency average = 1.609 ms > initial connection time = 24.080 ms > tps = 6214.244789 (without initial connection time) > > and without: > ./pgbench -c 10 -j2 -t1000 -d postgres > > pgbench (18devel) > starting vacuum...end. > transaction type: <builtin: TPC-B (sort of)> > scaling factor: 10 > query mode: simple > number of clients: 10 > number of threads: 2 > maximum number of tries: 1 > number of transactions per client: 1000 > number of transactions actually processed: 10000/10000 > number of failed transactions: 0 (0.000%) > latency average = 1.731 ms > initial connection time = 15.177 ms > tps = 5776.173285 (without initial connection time) > > tps with my patches increase. What do you think? > > Best regards, Stepan Neretin. > I implement reverse benchmarks: postgres=# SELECT bench_oid_reverse_sort(1000); bench_oid_reverse_sort ---------------------------------------------------------------------------------------------------------- Time taken by list_sort: 182557 ns, Time taken by list_oid_sort: 85864 ns, Percentage difference: 52.97% (1 row) Time: 2,291 ms postgres=# SELECT bench_oid_reverse_sort(100000); bench_oid_reverse_sort ------------------------------------------------------------------------------------------------------------- Time taken by list_sort: 9064163 ns, Time taken by list_oid_sort: 4313448 ns, Percentage difference: 52.41% (1 row) Time: 17,146 ms postgres=# SELECT bench_oid_reverse_sort(1000000); bench_oid_reverse_sort --------------------------------------------------------------------------------------------------------------- Time taken by list_sort: 61990395 ns, Time taken by list_oid_sort: 23703380 ns, Percentage difference: 61.76% (1 row) postgres=# SELECT bench_int_reverse_sort(1000000); bench_int_reverse_sort --------------------------------------------------------------------------------------------------------------- Time taken by list_sort: 50712416 ns, Time taken by list_int_sort: 24120417 ns, Percentage difference: 52.44% (1 row) Time: 89,359 ms postgres=# SELECT bench_float8_reverse_sort(1000000); bench_float8_reverse_sort ----------------------------------------------------------------------------------------------------------------- Time taken by usual sort: 57447775 ns, Time taken by optimized sort: 25214023 ns, Percentage difference: 56.11% (1 row) Time: 92,308 ms Hello again. I want to show you the graphs of when we increase the length vector/array sorting time (ns). What do you think about graphs? Best regards, Stepan Neretin.