Hi, attached is proposal idea by Tomas (in CC) for protecting and prioritizing OLTP latency on syncrep over other heavy WAL hitting sessions. This is the result of internal testing and research related to the syncrep behavior with Tomas, Alvaro and me. The main objective of this work-in-progress/crude patch idea (with GUC default disabled) is to prevent build-up of lag against the synchronous standby. It allows DBA to maintain stable latency for many backends when in parallel there is some WAL-heavy activity happening (full table rewrite, VACUUM, MV build, archiving, etc.). In other words it allows slow down of any backend activity. Any feedback on such a feature is welcome, including better GUC name proposals ;) and conditions in which such feature should be disabled even if it would be enabled globally (right now only anti- wraparound VACUUM comes to mind, it's not in the patch).
Demo; Given: - two DBs in syncrep configuration and artificial RTT 10ms latency between them (introduced via tc qdisc netem) - insertBIG.sql = "insert into bandwidthhog select repeat('c', 1000) from generate_series(1, 500000);" (50MB of WAL data) - pgbench (8c) and 1x INSERT session There are clearly visible drops of pgbench (OLTP) latency when the WAL socket is saturated: with 16devel/master and synchronous_commit_flush_wal_after=0 (disabled, default/baseline): postgres@host1:~$ pgbench -n -R 50 -c 8 -T 15 -P 1 pgbench (16devel) progress: 1.0 s, 59.0 tps, lat 18.840 ms stddev 11.251, 0 failed, lag 0.059 ms progress: 2.0 s, 48.0 tps, lat 14.332 ms stddev 4.272, 0 failed, lag 0.063 ms progress: 3.0 s, 56.0 tps, lat 15.383 ms stddev 6.270, 0 failed, lag 0.061 ms progress: 4.0 s, 51.0 tps, lat 15.104 ms stddev 5.850, 0 failed, lag 0.061 ms progress: 5.0 s, 47.0 tps, lat 15.184 ms stddev 5.472, 0 failed, lag 0.063 ms progress: 6.0 s, 23.0 tps, lat 88.495 ms stddev 141.845, 0 failed, lag 0.064 ms progress: 7.0 s, 1.0 tps, lat 999.053 ms stddev 0.000, 0 failed, lag 0.077 ms progress: 8.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed, lag 0.000 ms progress: 9.0 s, 1.0 tps, lat 2748.142 ms stddev NaN, 0 failed, lag 0.072 ms progress: 10.0 s, 68.1 tps, lat 3368.267 ms stddev 282.842, 0 failed, lag 2911.857 ms progress: 11.0 s, 97.0 tps, lat 2560.750 ms stddev 216.844, 0 failed, lag 2478.261 ms progress: 12.0 s, 96.0 tps, lat 1463.754 ms stddev 376.276, 0 failed, lag 1383.873 ms progress: 13.0 s, 94.0 tps, lat 616.243 ms stddev 230.673, 0 failed, lag 527.241 ms progress: 14.0 s, 59.0 tps, lat 48.265 ms stddev 72.533, 0 failed, lag 15.181 ms progress: 15.0 s, 39.0 tps, lat 14.237 ms stddev 6.073, 0 failed, lag 0.063 ms transaction type: <builtin: TPC-B (sort of)> [..] latency average = 931.383 ms latency stddev = 1188.530 ms rate limit schedule lag: avg 840.170 (max 3605.569) ms session2 output: postgres=# \i insertBIG.sql Timing is on. INSERT 0 500000 Time: 4119.485 ms (00:04.119) This new GUC makes it possible for the OLTP traffic to be less affected (latency-wise) when the heavy bulk traffic hits. With synchronous_commit_flush_wal_after=1024 (kB) it's way better, but latency rises up to 45ms: postgres@host1:~$ pgbench -n -R 50 -c 8 -T 15 -P 1 pgbench (16devel) progress: 1.0 s, 52.0 tps, lat 17.300 ms stddev 10.178, 0 failed, lag 0.061 ms progress: 2.0 s, 51.0 tps, lat 19.490 ms stddev 12.626, 0 failed, lag 0.061 ms progress: 3.0 s, 48.0 tps, lat 14.839 ms stddev 5.429, 0 failed, lag 0.061 ms progress: 4.0 s, 53.0 tps, lat 24.635 ms stddev 13.449, 0 failed, lag 0.062 ms progress: 5.0 s, 48.0 tps, lat 17.999 ms stddev 9.291, 0 failed, lag 0.062 ms progress: 6.0 s, 57.0 tps, lat 21.513 ms stddev 17.011, 0 failed, lag 0.058 ms progress: 7.0 s, 50.0 tps, lat 28.071 ms stddev 21.622, 0 failed, lag 0.061 ms progress: 8.0 s, 45.0 tps, lat 27.244 ms stddev 11.975, 0 failed, lag 0.064 ms progress: 9.0 s, 57.0 tps, lat 35.988 ms stddev 25.752, 0 failed, lag 0.057 ms progress: 10.0 s, 56.0 tps, lat 45.478 ms stddev 39.831, 0 failed, lag 0.534 ms progress: 11.0 s, 62.0 tps, lat 45.146 ms stddev 32.881, 0 failed, lag 0.058 ms progress: 12.0 s, 51.0 tps, lat 24.250 ms stddev 12.405, 0 failed, lag 0.063 ms progress: 13.0 s, 57.0 tps, lat 18.554 ms stddev 8.677, 0 failed, lag 0.060 ms progress: 14.0 s, 44.0 tps, lat 15.923 ms stddev 6.958, 0 failed, lag 0.065 ms progress: 15.0 s, 54.0 tps, lat 19.773 ms stddev 10.024, 0 failed, lag 0.063 ms transaction type: <builtin: TPC-B (sort of)> [..] latency average = 25.575 ms latency stddev = 21.540 ms session2 output: postgres=# set synchronous_commit_flush_wal_after = 1024; SET postgres=# \i insertBIG.sql INSERT 0 500000 Time: 8889.318 ms (00:08.889) With 16devel/master and synchronous_commit_flush_wal_after=256 (kB) all is smooth: postgres@host1:~$ pgbench -n -R 50 -c 8 -T 15 -P 1 pgbench (16devel) progress: 1.0 s, 49.0 tps, lat 14.345 ms stddev 4.700, 0 failed, lag 0.062 ms progress: 2.0 s, 45.0 tps, lat 14.812 ms stddev 5.816, 0 failed, lag 0.064 ms progress: 3.0 s, 49.0 tps, lat 13.145 ms stddev 4.320, 0 failed, lag 0.063 ms progress: 4.0 s, 44.0 tps, lat 14.429 ms stddev 4.715, 0 failed, lag 0.063 ms progress: 5.0 s, 49.0 tps, lat 18.111 ms stddev 8.536, 0 failed, lag 0.062 ms progress: 6.0 s, 58.0 tps, lat 17.929 ms stddev 8.198, 0 failed, lag 0.060 ms progress: 7.0 s, 65.0 tps, lat 20.186 ms stddev 12.973, 0 failed, lag 0.059 ms progress: 8.0 s, 47.0 tps, lat 16.174 ms stddev 6.508, 0 failed, lag 0.061 ms progress: 9.0 s, 45.0 tps, lat 14.485 ms stddev 4.736, 0 failed, lag 0.061 ms progress: 10.0 s, 53.0 tps, lat 16.879 ms stddev 8.783, 0 failed, lag 0.061 ms progress: 11.0 s, 42.0 tps, lat 13.711 ms stddev 4.464, 0 failed, lag 0.062 ms progress: 12.0 s, 49.0 tps, lat 13.252 ms stddev 4.082, 0 failed, lag 0.062 ms progress: 13.0 s, 48.0 tps, lat 14.179 ms stddev 6.238, 0 failed, lag 0.058 ms progress: 14.0 s, 43.0 tps, lat 12.210 ms stddev 2.993, 0 failed, lag 0.060 ms progress: 15.0 s, 34.0 tps, lat 14.811 ms stddev 6.544, 0 failed, lag 0.062 ms transaction type: <builtin: TPC-B (sort of)> [..] latency average = 15.454 ms latency stddev = 7.354 ms session2 output (notice the INSERT took much longer but did NOT affect the pgbench's stddev at all): postgres=# set synchronous_commit_flush_wal_after = 256; SET postgres=# \i insertBIG.sql Timing is on. [..] INSERT 0 500000 Time: 22737.729 ms (00:22.738) Without this feature (or with synchronous_commit_flush_wal_after=0) the TCP's SendQ on socket walsender-->walreceiver is growing and as such any next sendto() by OLTP backends/walwriter ends being queued too much causing stalls of activity. -Jakub Wartak.
0001-GUC-synchronous_commit_flush_wal_after.patch
Description: Binary data