Hey Michael, Really appreciate the review!
On Wed, Aug 11, 2021 at 12:40 AM Michael Paquier <mich...@paquier.xyz> wrote: > Agreed that the current behavior is confusing. As you are using the > commit timestamp for the comparison, this is right. One small-ish > comment I have about the code is that we should mention > recovery_min_apply_delay for HandleStartupProcInterrupts(), and not > only the trigger file location. Cool, updated. > +# First, set the delay for the next commit to some obscenely high value. > It has no need to be obscenely high, just high enough to give the time > to slow machines to catch the wait event lookup done. So this could > use better words to explain this choice. Sounds good. Done. > Anyway, it seems to me that something is incorrect in this new test > (manual tests pass here, the TAP test is off). One thing we need to > make sure for any new test added is that it correctly breaks if the > fix proposed is *not* in place. And as far as I can see, the test > passes even if the recalculation of delayUntil is done within the loop > that reloads the configuration. The thing may be a bit tricky to make > reliable as the parameter reloading may cause wait_event to not be > RecoveryApplyDelay, so we should have at least a check based on a scan > of pg_stat_replication.replay_lsn on the primary. Perhaps you have > an other idea? Hmm, please see attached v4 which just shifts the test to the middle, like v1. When I run the test without the code change, the test hangs as expected in: # Now the standby should catch up. $node_primary->wait_for_catchup('standby', 'replay'); and passes with the code change, as expected. I can't explain why the test doesn't freeze up in v3 in wait_for_catchup() at the end. As for checking on the wait event, since we only signal the standby after performing the check, that should be fine. Nothing else would be sending a SIGHUP before the check. Is that assumption correct? > When using wait_for_catchup(), I would recommend to list "replay" for > this test, even if that's the default mode, to make clear what the > intention is. Done. Regards, Soumyadeep (VMware)
From 824076a977af0e40da1c7eb9e4aeac9a8e81a7ee Mon Sep 17 00:00:00 2001 From: Soumyadeep Chakraborty <soumyadeep2...@gmail.com> Date: Mon, 2 Aug 2021 20:50:37 -0700 Subject: [PATCH v4 1/1] Refresh delayUntil in recovery delay loop This ensures that the wait interval in the loop is correctly recalculated with the updated value of recovery_min_apply_delay. Now, if one changes the GUC while we are waiting, those changes will take effect. Practical applications include instantly cancelling a long wait time by setting recovery_min_apply_delay to 0. This can be useful in tests. --- src/backend/access/transam/xlog.c | 13 +++++--- src/test/recovery/t/005_replay_delay.pl | 41 ++++++++++++++++++++++--- 2 files changed, 46 insertions(+), 8 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index d0ec6a834be..74ad7ff905b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -6234,12 +6234,11 @@ recoveryApplyDelay(XLogReaderState *record) if (!getRecordTimestamp(record, &xtime)) return false; - delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay); - /* * Exit without arming the latch if it's already past time to apply this * record */ + delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay); msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil); if (msecs <= 0) return false; @@ -6248,14 +6247,20 @@ recoveryApplyDelay(XLogReaderState *record) { ResetLatch(&XLogCtl->recoveryWakeupLatch); - /* might change the trigger file's location */ + /* might change the trigger file's location and recovery_min_apply_delay */ HandleStartupProcInterrupts(); if (CheckForStandbyTrigger()) break; /* - * Wait for difference between GetCurrentTimestamp() and delayUntil + * Recalculate delayUntil as recovery_min_apply_delay could have changed + * since last time. + */ + delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay); + + /* + * Wait for difference between GetCurrentTimestamp() and delayUntil. */ msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil); diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl index 0b56380e0a7..ad8093df41a 100644 --- a/src/test/recovery/t/005_replay_delay.pl +++ b/src/test/recovery/t/005_replay_delay.pl @@ -7,7 +7,7 @@ use warnings; use PostgresNode; use TestLib; -use Test::More tests => 3; +use Test::More tests => 4; # Initialize primary node my $node_primary = PostgresNode->new('primary'); @@ -56,6 +56,39 @@ $node_standby->poll_query_until('postgres', ok(time() - $primary_insert_time >= $delay, "standby applies WAL only after replication delay"); +# Now test to see if updates to recovery_min_apply_delay apply when a standby is +# waiting for a recovery delay to elapse. + +# First, set the delay for the next commit to a value much higher than the usual +# poll timeouts, so that replay remains blocked while we are running checks. +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO '2h';"); +$node_standby->reload; + +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(21, 30))"); + +# The standby's flush LSN should be caught up thanks to the walreceiver. +my $last_insert_lsn = + $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();"); +$node_primary->wait_for_catchup('standby', 'flush', $last_insert_lsn); + +# The standby's replay LSN will not be caught up as it should be waiting in +# recovery_min_apply_delay. +$node_standby->poll_query_until('postgres', qq{ +SELECT wait_event = 'RecoveryApplyDelay' FROM pg_stat_activity +WHERE backend_type='startup'; +}) or die "Timed out checking if startup is in recovery_min_apply_delay"; +is( $node_standby->safe_psql('postgres', + "SELECT pg_last_wal_replay_lsn() < '$last_insert_lsn'::pg_lsn;"), + 't', 'standby stuck in recovery_min_apply_delay'); + +# Clear the recovery_min_apply_delay timeout so that the wait is immediately +# cancelled and replay can proceed. +$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO DEFAULT;"); +$node_standby->reload; + +# Now the standby replay should catch up. +$node_primary->wait_for_catchup('standby', 'replay'); # Check that recovery can be paused or resumed expectedly. my $node_standby2 = PostgresNode->new('standby2'); @@ -72,7 +105,7 @@ is( $node_standby2->safe_psql( # Request to pause recovery and wait until it's actually paused. $node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()"); $node_primary->safe_psql('postgres', - "INSERT INTO tab_int VALUES (generate_series(21,30))"); + "INSERT INTO tab_int VALUES (generate_series(31,40))"); $node_standby2->poll_query_until('postgres', "SELECT pg_get_wal_replay_pause_state() = 'paused'") or die "Timed out while waiting for recovery to be paused"; @@ -84,7 +117,7 @@ my $receive_lsn = my $replay_lsn = $node_standby2->safe_psql('postgres', "SELECT pg_last_wal_replay_lsn()"); $node_primary->safe_psql('postgres', - "INSERT INTO tab_int VALUES (generate_series(31,40))"); + "INSERT INTO tab_int VALUES (generate_series(41,50))"); $node_standby2->poll_query_until('postgres', "SELECT '$receive_lsn'::pg_lsn < pg_last_wal_receive_lsn()") or die "Timed out while waiting for new WAL to be streamed"; @@ -102,7 +135,7 @@ $node_standby2->poll_query_until('postgres', # is triggered while recovery is paused. $node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()"); $node_primary->safe_psql('postgres', - "INSERT INTO tab_int VALUES (generate_series(41,50))"); + "INSERT INTO tab_int VALUES (generate_series(51,60))"); $node_standby2->poll_query_until('postgres', "SELECT pg_get_wal_replay_pause_state() = 'paused'") or die "Timed out while waiting for recovery to be paused"; -- 2.25.1