Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

Soumyadeep Chakraborty Fri, 13 Aug 2021 18:00:16 -0700

Hey Michael,

Really appreciate the review!

On Wed, Aug 11, 2021 at 12:40 AM Michael Paquier <mich...@paquier.xyz> wrote:

> Agreed that the current behavior is confusing.  As you are using the
> commit timestamp for the comparison, this is right.  One small-ish
> comment I have about the code is that we should mention
> recovery_min_apply_delay for HandleStartupProcInterrupts(), and not
> only the trigger file location.

Cool, updated.

> +# First, set the delay for the next commit to some obscenely high value.
> It has no need to be obscenely high, just high enough to give the time
> to slow machines to catch the wait event lookup done.  So this could
> use better words to explain this choice.

Sounds good. Done.

> Anyway, it seems to me that something is incorrect in this new test
> (manual tests pass here, the TAP test is off).  One thing we need to
> make sure for any new test added is that it correctly breaks if the
> fix proposed is *not* in place.  And as far as I can see, the test
> passes even if the recalculation of delayUntil is done within the loop
> that reloads the configuration.  The thing may be a bit tricky to make
> reliable as the parameter reloading may cause wait_event to not be
> RecoveryApplyDelay, so we should have at least a check based on a scan
> of pg_stat_replication.replay_lsn on the primary.  Perhaps you have
> an other idea?

Hmm, please see attached v4 which just shifts the test to the middle,
like v1. When I run the test without the code change, the test hangs
as expected in:

# Now the standby should catch up.
$node_primary->wait_for_catchup('standby', 'replay');

and passes with the code change, as expected. I can't explain why the
test doesn't freeze up in v3 in wait_for_catchup() at the end.

As for checking on the wait event, since we only signal the standby
after performing the check, that should be fine. Nothing else would be
sending a SIGHUP before the check. Is that assumption correct?

> When using wait_for_catchup(), I would recommend to list "replay" for
> this test, even if that's the default mode, to make clear what the
> intention is.

Done.

Regards,
Soumyadeep (VMware)

From 824076a977af0e40da1c7eb9e4aeac9a8e81a7ee Mon Sep 17 00:00:00 2001
From: Soumyadeep Chakraborty <soumyadeep2...@gmail.com>
Date: Mon, 2 Aug 2021 20:50:37 -0700
Subject: [PATCH v4 1/1] Refresh delayUntil in recovery delay loop

This ensures that the wait interval in the loop is correctly
recalculated with the updated value of recovery_min_apply_delay.

Now, if one changes the GUC while we are waiting, those changes will
take effect. Practical applications include instantly cancelling a long
wait time by setting recovery_min_apply_delay to 0. This can be useful
in tests.
---
 src/backend/access/transam/xlog.c       | 13 +++++---
 src/test/recovery/t/005_replay_delay.pl | 41 ++++++++++++++++++++++---
 2 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d0ec6a834be..74ad7ff905b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6234,12 +6234,11 @@ recoveryApplyDelay(XLogReaderState *record)
 	if (!getRecordTimestamp(record, &xtime))
 		return false;
 
-	delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
-
 	/*
 	 * Exit without arming the latch if it's already past time to apply this
 	 * record
 	 */
+	delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
 	msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), delayUntil);
 	if (msecs <= 0)
 		return false;
@@ -6248,14 +6247,20 @@ recoveryApplyDelay(XLogReaderState *record)
 	{
 		ResetLatch(&XLogCtl->recoveryWakeupLatch);
 
-		/* might change the trigger file's location */
+		/* might change the trigger file's location and recovery_min_apply_delay */
 		HandleStartupProcInterrupts();
 
 		if (CheckForStandbyTrigger())
 			break;
 
 		/*
-		 * Wait for difference between GetCurrentTimestamp() and delayUntil
+		 * Recalculate delayUntil as recovery_min_apply_delay could have changed
+		 * since last time.
+		 */
+		delayUntil = TimestampTzPlusMilliseconds(xtime, recovery_min_apply_delay);
+
+		/*
+		 * Wait for difference between GetCurrentTimestamp() and delayUntil.
 		 */
 		msecs = TimestampDifferenceMilliseconds(GetCurrentTimestamp(),
 												delayUntil);
diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl
index 0b56380e0a7..ad8093df41a 100644
--- a/src/test/recovery/t/005_replay_delay.pl
+++ b/src/test/recovery/t/005_replay_delay.pl
@@ -7,7 +7,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 3;
+use Test::More tests => 4;
 
 # Initialize primary node
 my $node_primary = PostgresNode->new('primary');
@@ -56,6 +56,39 @@ $node_standby->poll_query_until('postgres',
 ok(time() - $primary_insert_time >= $delay,
 	"standby applies WAL only after replication delay");
 
+# Now test to see if updates to recovery_min_apply_delay apply when a standby is
+# waiting for a recovery delay to elapse.
+
+# First, set the delay for the next commit to a value much higher than the usual
+# poll timeouts, so that replay remains blocked while we are running checks.
+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO '2h';");
+$node_standby->reload;
+
+$node_primary->safe_psql('postgres',
+	"INSERT INTO tab_int VALUES (generate_series(21, 30))");
+
+# The standby's flush LSN should be caught up thanks to the walreceiver.
+my $last_insert_lsn =
+	$node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
+$node_primary->wait_for_catchup('standby', 'flush', $last_insert_lsn);
+
+# The standby's replay LSN will not be caught up as it should be waiting in
+# recovery_min_apply_delay.
+$node_standby->poll_query_until('postgres', qq{
+SELECT wait_event = 'RecoveryApplyDelay' FROM pg_stat_activity
+WHERE backend_type='startup';
+}) or die "Timed out checking if startup is in recovery_min_apply_delay";
+is( $node_standby->safe_psql('postgres',
+	"SELECT pg_last_wal_replay_lsn() < '$last_insert_lsn'::pg_lsn;"),
+	't', 'standby stuck in recovery_min_apply_delay');
+
+# Clear the recovery_min_apply_delay timeout so that the wait is immediately
+# cancelled and replay can proceed.
+$node_standby->safe_psql('postgres', "ALTER SYSTEM SET recovery_min_apply_delay TO DEFAULT;");
+$node_standby->reload;
+
+# Now the standby replay should catch up.
+$node_primary->wait_for_catchup('standby', 'replay');
 
 # Check that recovery can be paused or resumed expectedly.
 my $node_standby2 = PostgresNode->new('standby2');
@@ -72,7 +105,7 @@ is( $node_standby2->safe_psql(
 # Request to pause recovery and wait until it's actually paused.
 $node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
 $node_primary->safe_psql('postgres',
-	"INSERT INTO tab_int VALUES (generate_series(21,30))");
+	"INSERT INTO tab_int VALUES (generate_series(31,40))");
 $node_standby2->poll_query_until('postgres',
 	"SELECT pg_get_wal_replay_pause_state() = 'paused'")
   or die "Timed out while waiting for recovery to be paused";
@@ -84,7 +117,7 @@ my $receive_lsn =
 my $replay_lsn =
   $node_standby2->safe_psql('postgres', "SELECT pg_last_wal_replay_lsn()");
 $node_primary->safe_psql('postgres',
-	"INSERT INTO tab_int VALUES (generate_series(31,40))");
+	"INSERT INTO tab_int VALUES (generate_series(41,50))");
 $node_standby2->poll_query_until('postgres',
 	"SELECT '$receive_lsn'::pg_lsn < pg_last_wal_receive_lsn()")
   or die "Timed out while waiting for new WAL to be streamed";
@@ -102,7 +135,7 @@ $node_standby2->poll_query_until('postgres',
 # is triggered while recovery is paused.
 $node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()");
 $node_primary->safe_psql('postgres',
-	"INSERT INTO tab_int VALUES (generate_series(41,50))");
+	"INSERT INTO tab_int VALUES (generate_series(51,60))");
 $node_standby2->poll_query_until('postgres',
 	"SELECT pg_get_wal_replay_pause_state() = 'paused'")
   or die "Timed out while waiting for recovery to be paused";
-- 
2.25.1

Re: Changes to recovery_min_apply_delay are ignored while waiting for delay

Reply via email to