On Thu, 9 Jan 2025 at 09:50, Jeff Davis <pg...@j-davis.com> wrote:
> Attached POC patch, which reduces memory usage by ~15% for a simple
> distinct query on an integer key. Performance is the same or perhaps a
> hair faster.
>
> It's not many lines of code, but the surrounding code might benefit
> from some refactoring which would make it a bit simpler.

Thanks for working on this. Here's a preliminary review:

Since bump.c does not add headers to the palloc'd chunks, I think the
following code from hash_agg_entry_size() shouldn't be using
CHUNKHDRSZ anymore.

tupleChunkSize = CHUNKHDRSZ + tupleSize;

if (pergroupSize > 0)
    pergroupChunkSize = CHUNKHDRSZ + pergroupSize;
else
    pergroupChunkSize = 0;

You should be able to get rid of pergroupChunkSize and just use
pergroupSize in the return.

I did some benchmarking using the attached script. There's a general
speedup, but I saw an unexpected increase in the number of batches
with the patched version on certain tests. See the attached results.
For example, the work_mem = 8MB test with 10 million rows shows
"Batches: 129" on master but "Batches: 641" with the patched version.
I didn't check why.

David
#!/bin/bash

dbname=postgres
secs=10
psql -c "alter system set max_parallel_workers_per_gather = 0;" $dbname > /dev/null
psql -c "alter system set jit = 0;" $dbname > /dev/null
psql -c "select pg_reload_conf();" $dbname > /dev/null
psql -c "create extension if not exists pg_prewarm;" $dbname > /dev/null
psql -c "drop table if exists hashagg;" $dbname > /dev/null
psql -c "create table hashagg (a bigint);" $dbname > /dev/null

for rows in 10000 100000 1000000 10000000
#for rows in 10000000
do
        psql -c "truncate table hashagg;" $dbname > /dev/null
        psql -c "insert into hashagg select a from generate_series(1, $rows) a;" $dbname > /dev/null
        psql -c "vacuum freeze analyze hashagg;" $dbname > /dev/null
        psql -c "select pg_prewarm('hashagg');" $dbname > /dev/null

        for work_mem in '512kB' '1MB' '2MB' '4MB' '8MB' '16MB' '32MB' '64MB' '128MB' '256MB'
        do
                psql -c "alter system set work_mem = '$work_mem';" $dbname > /dev/null
                psql -c "select pg_reload_conf();" $dbname > /dev/null
                echo "select a,count(*) from hashagg group by a;" > bench.sql
                psql -c "explain analyze select a,count(*) from hashagg group by a;" $dbname | grep "Batches" | tr '\n' ' '
                echo -n "$rows $work_mem "
                for i in {1..3}
                do
                        pgbench -n -f bench.sql -M prepared -T $secs $dbname | grep latency | sed 's/[^0-9.]//g' | tr '\n' ' '
                done
                echo "ms"
        done
done
master @ 231006451

$ ./hashagg_bench.sh
   Batches: 5  Memory Usage: 1073kB  Disk Usage: 200kB 10000 512kB 4.110 4.090 4.061 ms
   Batches: 1  Memory Usage: 1041kB 10000 1MB 3.735 3.721 3.703 ms
   Batches: 1  Memory Usage: 1041kB 10000 2MB 3.710 3.714 3.709 ms
   Batches: 1  Memory Usage: 1169kB 10000 4MB 3.525 3.495 3.506 ms
   Batches: 1  Memory Usage: 1425kB 10000 8MB 3.503 3.475 3.521 ms
   Batches: 1  Memory Usage: 1425kB 10000 16MB 3.513 3.508 3.511 ms
   Batches: 1  Memory Usage: 1425kB 10000 32MB 3.521 3.520 3.499 ms
   Batches: 1  Memory Usage: 1425kB 10000 64MB 3.527 3.514 3.513 ms
   Batches: 1  Memory Usage: 1425kB 10000 128MB 3.508 3.502 3.502 ms
   Batches: 1  Memory Usage: 1425kB 10000 256MB 3.488 3.512 3.515 ms
   Planned Partitions: 16  Batches: 17  Memory Usage: 1041kB  Disk Usage: 3040kB 100000 512kB 46.576 46.636 46.531 ms
   Planned Partitions: 8  Batches: 9  Memory Usage: 2065kB  Disk Usage: 3416kB 100000 1MB 47.087 47.063 47.024 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 4145kB  Disk Usage: 1768kB 100000 2MB 44.445 44.596 44.609 ms
   Batches: 5  Memory Usage: 8241kB  Disk Usage: 728kB 100000 4MB 43.944 44.125 43.987 ms
   Batches: 1  Memory Usage: 12817kB 100000 8MB 48.186 47.549 47.597 ms
   Batches: 1  Memory Usage: 13329kB 100000 16MB 46.805 46.795 46.685 ms
   Batches: 1  Memory Usage: 14353kB 100000 32MB 46.127 46.575 46.087 ms
   Batches: 1  Memory Usage: 14353kB 100000 64MB 46.536 46.173 46.847 ms
   Batches: 1  Memory Usage: 14353kB 100000 128MB 46.770 45.984 46.219 ms
   Batches: 1  Memory Usage: 14353kB 100000 256MB 46.706 46.043 46.078 ms
   Planned Partitions: 32  Batches: 217  Memory Usage: 1105kB  Disk Usage: 30608kB 1000000 512kB 545.327 543.614 543.034 ms
   Planned Partitions: 64  Batches: 65  Memory Usage: 2193kB  Disk Usage: 28648kB 1000000 1MB 476.331 477.259 476.427 ms
   Planned Partitions: 32  Batches: 33  Memory Usage: 4113kB  Disk Usage: 30584kB 1000000 2MB 484.892 484.598 487.688 ms
   Planned Partitions: 16  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31328kB 1000000 4MB 490.869 489.658 492.045 ms
   Planned Partitions: 8  Batches: 9  Memory Usage: 16465kB  Disk Usage: 23936kB 1000000 8MB 523.156 524.944 521.537 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 32817kB  Disk Usage: 19864kB 1000000 16MB 573.808 573.492 575.078 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 65585kB  Disk Usage: 11608kB 1000000 32MB 611.767 609.259 609.378 ms
   Batches: 1  Memory Usage: 114705kB 1000000 64MB 609.220 605.576 608.327 ms
   Batches: 1  Memory Usage: 114705kB 1000000 128MB 611.371 618.847 623.132 ms
   Batches: 1  Memory Usage: 114705kB 1000000 256MB 618.622 616.254 618.368 ms
   Planned Partitions: 32  Batches: 4182  Memory Usage: 1105kB  Disk Usage: 292240kB 10000000 512kB 6282.199 6159.717 6166.364 ms
   Planned Partitions: 64  Batches: 1001  Memory Usage: 2193kB  Disk Usage: 322784kB 10000000 1MB 6024.200 6051.042 6047.530 ms
   Planned Partitions: 128  Batches: 641  Memory Usage: 4241kB  Disk Usage: 384136kB 10000000 2MB 5784.646 5876.862 5816.221 ms
   Planned Partitions: 256  Batches: 257  Memory Usage: 8465kB  Disk Usage: 506976kB 10000000 4MB 5408.663 5395.462 5404.561 ms
   Planned Partitions: 128  Batches: 129  Memory Usage: 16401kB  Disk Usage: 384112kB 10000000 8MB 5602.240 5570.921 5608.489 ms
   Planned Partitions: 64  Batches: 65  Memory Usage: 33297kB  Disk Usage: 322656kB 10000000 16MB 6551.026 6594.607 6477.449 ms
   Planned Partitions: 32  Batches: 33  Memory Usage: 65809kB  Disk Usage: 259968kB 10000000 32MB 7553.664 7540.932 7462.904 ms
   Planned Partitions: 16  Batches: 17  Memory Usage: 131217kB  Disk Usage: 244400kB 10000000 64MB 8147.225 8086.363 8083.296 ms
   Planned Partitions: 8  Batches: 9  Memory Usage: 262225kB  Disk Usage: 211616kB 10000000 128MB 8358.746 8312.645 8333.737 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 524337kB  Disk Usage: 134640kB 10000000 256MB 7950.195 7943.805 7963.620 ms
   
   
Jeff's bump alloc hashagg patch v1:

   Batches: 1  Memory Usage: 921kB 10000 512kB 3.607 3.634 3.631 ms
   Batches: 1  Memory Usage: 921kB 10000 1MB 3.534 3.521 3.588 ms
   Batches: 1  Memory Usage: 921kB 10000 2MB 3.591 3.537 3.579 ms
   Batches: 1  Memory Usage: 921kB 10000 4MB 3.589 3.598 3.589 ms
   Batches: 1  Memory Usage: 921kB 10000 8MB 3.587 3.598 3.580 ms
   Batches: 1  Memory Usage: 921kB 10000 16MB 3.583 3.623 3.567 ms
   Batches: 1  Memory Usage: 921kB 10000 32MB 3.565 3.601 3.581 ms
   Batches: 1  Memory Usage: 921kB 10000 64MB 3.573 3.573 3.586 ms
   Batches: 1  Memory Usage: 921kB 10000 128MB 3.569 3.580 3.590 ms
   Batches: 1  Memory Usage: 921kB 10000 256MB 3.607 3.583 3.577 ms
   Planned Partitions: 16  Batches: 17  Memory Usage: 1049kB  Disk Usage: 3040kB 100000 512kB 46.054 46.249 45.926 ms
   Planned Partitions: 8  Batches: 9  Memory Usage: 1881kB  Disk Usage: 3392kB 100000 1MB 46.470 46.493 46.507 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 3641kB  Disk Usage: 1680kB 100000 2MB 43.501 43.606 43.470 ms
   Batches: 5  Memory Usage: 10297kB  Disk Usage: 208kB 100000 4MB 47.999 47.991 48.039 ms
   Batches: 1  Memory Usage: 10265kB 100000 8MB 44.940 45.104 45.018 ms
   Batches: 1  Memory Usage: 10265kB 100000 16MB 44.832 44.912 44.783 ms
   Batches: 1  Memory Usage: 10265kB 100000 32MB 44.550 44.923 44.819 ms
   Batches: 1  Memory Usage: 10265kB 100000 64MB 44.578 44.832 44.827 ms
   Batches: 1  Memory Usage: 10265kB 100000 128MB 44.792 45.085 44.813 ms
   Batches: 1  Memory Usage: 10265kB 100000 256MB 44.777 44.901 44.789 ms
   Planned Partitions: 32  Batches: 657  Memory Usage: 1473kB  Disk Usage: 30608kB 1000000 512kB 602.477 602.585 603.783 ms
   Planned Partitions: 64  Batches: 65  Memory Usage: 2329kB  Disk Usage: 28648kB 1000000 1MB 470.372 472.783 473.525 ms
   Planned Partitions: 32  Batches: 33  Memory Usage: 3865kB  Disk Usage: 30568kB 1000000 2MB 487.759 487.002 485.999 ms
   Planned Partitions: 16  Batches: 81  Memory Usage: 10393kB  Disk Usage: 31296kB 1000000 4MB 547.232 549.565 550.964 ms
   Planned Partitions: 8  Batches: 41  Memory Usage: 20569kB  Disk Usage: 23800kB 1000000 8MB 595.448 594.752 595.890 ms
   Planned Partitions: 4  Batches: 21  Memory Usage: 41017kB  Disk Usage: 19256kB 1000000 16MB 678.615 674.505 674.172 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 81977kB  Disk Usage: 7528kB 1000000 32MB 631.274 627.113 628.018 ms
   Batches: 1  Memory Usage: 90137kB 1000000 64MB 584.076 585.099 582.715 ms
   Batches: 1  Memory Usage: 90137kB 1000000 128MB 584.275 579.396 581.503 ms
   Batches: 1  Memory Usage: 90137kB 1000000 256MB 577.607 589.093 584.943 ms
   Planned Partitions: 32  Batches: 8469  Memory Usage: 1473kB  Disk Usage: 292240kB 10000000 512kB 6870.746 6877.693 6903.483 ms
   Planned Partitions: 64  Batches: 4009  Memory Usage: 2881kB  Disk Usage: 322784kB 10000000 1MB 6841.746 6809.542 6816.949 ms
   Planned Partitions: 128  Batches: 2673  Memory Usage: 5185kB  Disk Usage: 384136kB 10000000 2MB 6449.305 6446.077 6438.334 ms
   Planned Partitions: 256  Batches: 257  Memory Usage: 9241kB  Disk Usage: 506976kB 10000000 4MB 5359.919 5360.792 5365.025 ms
   Planned Partitions: 128  Batches: 641  Memory Usage: 21529kB  Disk Usage: 384104kB 10000000 8MB 6663.404 6636.793 6632.275 ms
   Planned Partitions: 64  Batches: 321  Memory Usage: 41497kB  Disk Usage: 322616kB 10000000 16MB 8039.401 8034.878 8046.826 ms
   Planned Partitions: 32  Batches: 161  Memory Usage: 82201kB  Disk Usage: 259840kB 10000000 32MB 8917.696 9073.211 8933.217 ms
   Planned Partitions: 16  Batches: 17  Memory Usage: 163993kB  Disk Usage: 243832kB 10000000 64MB 8144.712 8089.529 8164.517 ms
   Planned Partitions: 8  Batches: 9  Memory Usage: 311385kB  Disk Usage: 195776kB 10000000 128MB 8258.068 8346.843 8250.325 ms
   Planned Partitions: 4  Batches: 5  Memory Usage: 630841kB  Disk Usage: 113920kB 10000000 256MB 7946.322 8081.977 7913.341 ms
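
The batch-count differences mentioned in the review (for example 129 vs 641 at work_mem = 8MB with 10 million rows) can be picked out mechanically. The following is only a sketch, not part of the original attachment: the function name compare_batches and the file names master.txt and patched.txt are hypothetical, and it assumes each result sits on a single line ending in "<rows> <work_mem> <t1> <t2> <t3> ms", which is how the benchmark script prints it.

```bash
#!/bin/bash
# Sketch: print the rows/work_mem combinations whose "Batches:" count
# differs between two saved outputs of the benchmark script.
# Output columns: rows work_mem batches_in_first_file batches_in_second_file
# compare_batches is a hypothetical helper, not part of the original script.
compare_batches() {
    awk '
        # Return the value that follows "Batches:" on the current line.
        function batches(    i) {
            for (i = 1; i < NF; i++)
                if ($i == "Batches:")
                    return $(i + 1)
            return "?"
        }
        # Only consider complete result lines, which end in "ms".
        $NF == "ms" && NF >= 6 {
            key = $(NF - 5) " " $(NF - 4)    # rows and work_mem
            if (FILENAME == ARGV[1])
                first[key] = batches()
            else if ((key in first) && first[key] != batches())
                print key, first[key], batches()
        }
    ' "$1" "$2"
}
```

With each run saved to a file (e.g. ./hashagg_bench.sh > master.txt on master, then > patched.txt with the patch applied), compare_batches master.txt patched.txt lists only the tests where the batch counts diverge, which narrows down where to start looking.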
