Here are the performance test results and analysis with the recent patches.
Test Setup:
- Created Pub and Sub nodes with logical replication and the configurations below:
    autovacuum_naptime = '30s'
    shared_buffers = '30GB'
    max_wal_size = 20GB
    min_wal_size = 10GB
    track_commit_timestamp = on   (only on the Sub node)
- The Pub and Sub had different pgbench tables with initial data of scale=100.

-------------------------------
Case-0: Collected data on pgHead
-------------------------------
- Ran pgbench(read-write) on both the publisher and the subscriber with 30
  clients for a duration of 15 minutes, collecting data over 3 runs.

Results:
 Run#     pub_TPS        sub_TPS
 1        30551.63471    29476.81709
 2        30112.31203    28933.75013
 3        29599.40383    28379.4977
 Median   30112.31203    28933.75013

-------------------------------
Case-1: Long-run (15-minute) tests with retain_conflict_info=ON
-------------------------------
- Code: pgHead + v19 patches.
- At the Sub, set autovacuum=false.
- Ran pgbench(read-write) on both the publisher and the subscriber with 30
  clients for a duration of 15 minutes, collecting data over 3 runs.

Results:
 Run#        pub_TPS        sub_TPS
 1           30326.57637    4890.410972
 2           30412.85115    4787.192754
 3           30860.13879    4864.549117
 Median      30412.85115    4864.549117
 regression  1%             -83%

- The 15-minute pgbench run showed a much larger reduction in the Sub's TPS;
  as the test run time increased, the TPS at the Sub node dropped further.

-------------------------------
Case-2: Re-ran Case-1 with autovacuum enabled and running every 30 seconds
-------------------------------
- Code: pgHead + v19 patches.
- At the Sub, set autovacuum=true.
- Also measured the frequency of slot.xmin and the worker's
  oldest_nonremovable_xid updates.

Results:
 Run#        pub_TPS      sub_TPS      #slot.xmin_updates  #worker's_oldest_nonremovable_xid_updates
 1           31080.30944  4573.547293  0                   1
 regression  3%           -84%

- Autovacuum did not help in improving the Sub's TPS.
- The slot's xmin was not advanced.

~~~~
Observations and RCA for the TPS reduction in the above tests:

- The launcher was not able to advance slot.xmin during the 15-minute pgbench
  run, leading to increased dead tuple accumulation on the subscriber node.
- The launcher failed to advance slot.xmin because the apply worker could not
  set its oldest_nonremovable_xid early and frequently enough, for the
  following two reasons:
  1) For large pgbench tables (scale=100), the tablesync takes time to
     complete, forcing the apply worker to wait before updating its
     oldest_nonremovable_xid.
  2) With 30 clients generating operations at a pace that a single apply
     worker cannot match, the worker fails to catch up with the rapidly
     increasing remote_lsn, lagging behind the Publisher's LSN throughout the
     15-minute run.

Considering the above reasons, for better performance measurements, data was
collected with table_sync off and a varying number of clients on the
publisher node. The test below used the v21 patch set, which also includes
the improvement patches (006 and 007) for more frequent slot.xmin updates.

-------------------------------
Case-3: Create the subscription with "copy_data=false", so there is no tablesync in the picture
-------------------------------
Test setup:
- Code: pgHead + v21 patches.
- Created Pub and Sub nodes with logical replication and the configurations below:
    autovacuum_naptime = '30s'
    shared_buffers = '30GB'
    max_wal_size = 20GB
    min_wal_size = 10GB
    track_commit_timestamp = on   (only on the Sub node)
- The Pub and Sub had different pgbench tables with initial data of scale=100.
- The subscription was created with copy_data=false (a sketch of the command
  is shown below).
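A minimal sketch of the subscription command used for this case, with the
connection string, publication/subscription names, and the
retain_conflict_info option taken from the attached setup script (treat it as
illustrative rather than the exact command used):

    CREATE SUBSCRIPTION sub
        CONNECTION 'port=6633 user=postgres'
        PUBLICATION pub
        WITH (copy_data = false, retain_conflict_info = on);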
- Ran pgbench(read-write) on both the pub and the sub for a duration of 15
  minutes, using 30 clients on the Subscriber while varying the number of
  clients on the Publisher.
- In addition to TPS, the frequency of slot.xmin and the worker's
  oldest_nonremovable_xid updates was also measured.

Observations:
- As the number of clients on the publisher increased, the publisher's TPS
  improved, but the subscriber's TPS dropped significantly.
- The frequency of slot.xmin updates also declined with more clients on the
  publisher, indicating that the apply worker updated its
  oldest_nonremovable_xid less frequently as the read-write operations on the
  publisher increased.

Results:
 #Pub-clients  pubTPS       pubTPS_increment  subTPS       subTPS_reduction  #slot.xmin_updates  #worker's_oldest_nonremovable_xid_updates
 1             1364.487898  0                 35000.06738  0                 6976                6977
 2             2706.100445  98%               32297.81408  -8%               5838                5839
 4             5079.522778  272%              8581.034791  -75%              268                 269
 30            31308.18524  2195%             5324.328696  -85%              4                   5

Note: In the above result table,
- "pubTPS_increment" represents the % improvement in the Pub's TPS compared
  to its TPS in the initial run with #Pub-clients=1, and
- "subTPS_reduction" indicates the % decrease in the Sub's TPS compared to
  its TPS in the initial run with #Pub-clients=1.

~~~~
Conclusion:

There is some improvement in the slot.xmin update frequency with table_sync
off and the additional patches that update the slot's xmin more aggressively.
However, the key point is that with a large number of clients generating
write operations, the apply worker lags by a large margin, so the slot's xmin
stops being updated as the test run time increases. This is also visible in
Case-3: with only 1 client on the publisher there is no degradation on the
subscriber, and as the number of clients increases, the degradation increases
as well.

Based on this test analysis, I can say that we need some way/option to
invalidate such slots that lag by a threshold margin, as mentioned at [1].
This should solve the performance degradation and bloat problem.

~~~~
(Attached are the test scripts used for the above tests.)

[1] https://www.postgresql.org/message-id/CAA4eK1Jyo4odkVsnSeAWPh8Wgpw12EbS9q8s_eN14LtcFNXCSA%40mail.gmail.com

--
Thanks,
Nisha
#!/bin/bash

##################
### Definition ###
##################
port_pub=6633
port_sub=6634

## prefix
PUB_PREFIX="$HOME/project/pg1/postgres/inst/bin"

## scale factor
SCALE=100

## pgbench init command
INIT_COMMAND="pgbench -i -U postgres postgres -s $SCALE"

## "head" for pgHead builds, anything else for patched builds
SOURCE=$1

################
### clean up ###
################
./pg_ctl stop -D data_pub -w
./pg_ctl stop -D data_sub -w
rm -rf data* *log

#######################
### setup publisher ###
#######################
./initdb -D data_pub -U postgres
cat << EOF >> data_pub/postgresql.conf
port=$port_pub
# autovacuum = false
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
wal_level = logical
EOF

./pg_ctl -D data_pub start -w -l pub.log
${PUB_PREFIX}/$INIT_COMMAND -p $port_pub
./psql -U postgres -p $port_pub -c "CREATE PUBLICATION pub FOR ALL TABLES;"

########################
### setup subscriber ###
########################
./initdb -D data_sub -U postgres
cat << EOF >> data_sub/postgresql.conf
port=$port_sub
# autovacuum = false
autovacuum_naptime = '30s'
shared_buffers = '30GB'
max_wal_size = 20GB
min_wal_size = 10GB
track_commit_timestamp = on
#log_min_messages = DEBUG1
EOF

./pg_ctl -D data_sub start -w -l sub.log
./$INIT_COMMAND -p $port_sub

# Create the pgbench_pub_* tables on the subscriber
(
echo "CREATE TABLE pgbench_pub_history (tid int,bid int,aid bigint,delta int,mtime timestamp,filler char(22));"
echo "CREATE TABLE pgbench_pub_tellers (tid int not null primary key,bid int,tbalance int,filler char(84));"
echo "CREATE TABLE pgbench_pub_accounts (aid bigint not null primary key,bid int,abalance int,filler char(84));"
echo "CREATE TABLE pgbench_pub_branches (bid int not null primary key,bbalance int,filler char(88));"
) | ./psql -p $port_sub -U postgres

if [ "$SOURCE" = "head" ]
then
    ./psql -U postgres -p $port_sub -c "CREATE SUBSCRIPTION sub CONNECTION 'port=6633 user=postgres' PUBLICATION pub;"
else
    ./psql -U postgres -p $port_sub -c "CREATE SUBSCRIPTION sub CONNECTION 'port=6633 user=postgres' PUBLICATION pub WITH (retain_conflict_info = on);"
fi

# Wait until all the table sync is done
REMAIN="f"
while [ "$REMAIN" = "f" ]
do
    # Sleep a bit to avoid running the query too much
    sleep 1s

    # Check the pg_subscription_rel catalog. This query is ported from
    # wait_for_subscription_sync() defined in Cluster.pm; it returns 't'
    # once no relation is left in a non-ready state.
    REMAIN=`./psql -qtA -U postgres -p $port_sub -c "SELECT count(1) = 0 FROM pg_subscription_rel WHERE srsubstate NOT IN ('r', 's');"`

    # Print the result for debugging purposes
    echo $REMAIN
done
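While the pgbench runs started by the script below are in progress, the
apply-worker lag described in the RCA above can be observed from the
publisher side with a standard catalog query; a minimal sketch (not part of
the attached scripts):

    -- Run on the publisher: how far the subscriber's apply position trails
    -- the publisher's current WAL location.
    SELECT application_name,
           pg_current_wal_lsn() AS publisher_lsn,
           replay_lsn,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) AS apply_lag
    FROM pg_stat_replication;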
#!/bin/bash

##################
### Definition ###
##################
#export PATH="$HOME/project/pg1/postgres/inst/bin:$PATH"
port_pub=6633
port_sub=6634

## prefix
PUB_PREFIX="$HOME/project/pg1/postgres/inst/bin"

## Used source
SOURCE=v21_tswait

## Number of runs
NUMRUN=1

## Measurement duration
DURATION=900

## Number of clients during a run
NUMCLIENTS=30

###########################
### measure performance ###
###########################
for i in `seq ${NUMRUN}`
do
    # Prepare a clean environment for each measurement
    #./v2_setup_n.sh $SOURCE
    sh $HOME/project/update_deleted/perf_test_v21/v21_case1_setup.sh $SOURCE

    echo "=================="
    echo "${SOURCE}_${i}.dat"
    echo "=================="

    # Do the actual measurements: run pgbench against the publisher in the
    # background and against the subscriber in the foreground, both for
    # $DURATION seconds with $NUMCLIENTS clients.
    ${PUB_PREFIX}/pgbench -p $port_pub -U postgres postgres -c $NUMCLIENTS -j $NUMCLIENTS -T $DURATION -P 30 > pub_${SOURCE}_${i}.dat &
    ./pgbench -p $port_sub -U postgres postgres -c $NUMCLIENTS -j $NUMCLIENTS -T $DURATION -P 30 > sub_${SOURCE}_${i}.dat

    echo "=================="
    sleep 10s
done
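The attached scripts only set up the nodes and drive pgbench; the counts of
slot.xmin and oldest_nonremovable_xid updates reported above were collected
separately. For the externally visible pieces, a minimal sketch of a sampling
query on the subscriber could look like the following, assuming the
conflict-detection slot created by the patch set is visible in
pg_replication_slots (the worker's oldest_nonremovable_xid is internal state
and is not covered here):

    -- Run periodically on the subscriber: slot xmin horizons and dead-tuple
    -- accumulation in the pgbench tables.
    SELECT now() AS sample_time, slot_name, xmin, catalog_xmin
    FROM pg_replication_slots;

    SELECT relname, n_live_tup, n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname LIKE 'pgbench%'
    ORDER BY relname;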