On Mon, 23 Sept 2024 at 09:59, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Sun, Sep 22, 2024 at 11:27 AM David Rowley <dgrowle...@gmail.com> wrote:
> >
> > On Fri, 20 Sept 2024 at 17:46, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > On Fri, Sep 20, 2024 at 5:13 AM David Rowley <dgrowle...@gmail.com> wrote:
> > > > In general, it's a bit annoying to have to code around this
> > > > GenerationContext fragmentation issue.
> > >
> > > Right, and I am also slightly afraid that this may cause some
> > > regression in other cases where defrag wouldn't help.
> >
> > Yeah, that's certainly a possibility. I was hoping that
> > MemoryContextMemAllocated() being much larger than logical_work_mem
> > could only happen when there is fragmentation, but certainly, you
> > could be wasting effort trying to defrag transactions where the
> > changes all arrive in WAL consecutively and there is no
> > fragmentation. It might be some other large transaction that's
> > causing the context's allocations to be fragmented. I don't have any
> > good ideas on how to avoid wasting effort on non-problematic
> > transactions. Maybe there's something that could be done if we knew
> > the LSN of the first and last change and the gap between the LSNs was
> > much larger than the WAL space used for this transaction. That would
> > likely require tracking way more stuff than we do now, however.
>
> With more information tracking, we could avoid some non-problematic
> transactions, but still, it would be difficult to predict that we
> didn't harm many cases because, to make the memory non-contiguous, we
> only need a few interleaving small transactions. We can try to think
> of ideas for implementing defragmentation in our code if we first can
> prove that smaller block sizes cause problems.
> > With the smaller blocks idea, I'm a bit concerned that using smaller
> > blocks could cause regressions on systems that are better at releasing
> > memory back to the OS after free(), as no doubt malloc() would often be
> > slower on those systems. There have been some complaints recently
> > about glibc being a bit too happy to keep hold of memory after free(),
> > and I wondered if that was the reason why the small block test does
> > not cause much of a performance regression. I wonder how the small
> > block test would look on Mac, FreeBSD or Windows. I think it would be
> > risky to assume that all is well with reducing the block size after
> > testing on a single platform.
>
> Good point. We need extensive testing on different platforms, as you
> suggest, to verify whether smaller block sizes cause any regressions.
I did similar tests on Windows. rb_mem_block_size was changed from 8kB
to 8MB. The table below shows the average and standard deviation of 5
runs for each block size.

===============================================
block size | Average time (ms) | Standard Deviation (ms)
-----------------------------------------------
8kB        | 12580.879         | 144.6923467
16kB       | 12442.7256        |  94.02799006
32kB       | 12370.7292        |  97.7958552
64kB       | 11877.4888        | 222.2419142
128kB      | 11828.8568        | 129.732941
256kB      | 11801.086         |  20.60030913
512kB      | 12361.4172        |  65.27390105
1MB        | 12343.3732        |  80.84427202
2MB        | 12357.675         |  79.40017604
4MB        | 12395.8364        |  76.78273689
8MB        | 11712.8862        |  50.74323039
===============================================

From the results, I think there is a small regression for small block
sizes. I ran the tests in git bash. I have also attached the test
script.

Thanks and Regards,
Shlok Kyal
#!/bin/bash

if [ -z "$1" ]
then
  size="8kB"
else
  size=$1
fi

echo 'Clean up'
echo $size
./pg_ctl stop -D data
rm -rf data* logfile

echo 'Set up'
./initdb -D data -U postgres

cat << EOF >> data/postgresql.conf
wal_level = logical
autovacuum = false
checkpoint_timeout = 1h
shared_buffers = '10GB'
work_mem = '1GB'
logical_decoding_work_mem = '2097151 kB'
max_wal_size = 20GB
min_wal_size = 10GB
rb_mem_block_size = $size
EOF

./pg_ctl -D data start -w -l logfile

(
echo -E "SELECT * FROM pg_create_logical_replication_slot('test', 'test_decoding');"
echo -E "CREATE TABLE foo (id int);"
echo -E "INSERT INTO foo VALUES (generate_series(1, 10000000));"
) | ./psql -U postgres

for i in `seq 1 5`
do
(
echo -E "\timing"
echo -E "SELECT count(*) FROM pg_logical_slot_peek_changes('test', NULL, NULL);"
) | ./psql -U postgres
done