Re: Race condition in recovery?

Dilip Kumar Fri, 04 Jun 2021 00:51:42 -0700

On Fri, Jun 4, 2021 at 2:03 AM Robert Haas <[email protected]> wrote:
>
> On Thu, May 27, 2021 at 2:26 AM Dilip Kumar <[email protected]> wrote:
> > Changed as suggested.
>
> I don't think the code as written here is going to work on Windows,
> because your code doesn't duplicate enable_restoring's call to
> perl2host or its backslash-escaping logic. It would really be better
> if we could use enable_restoring directly. Also, I discovered that the
> 'return' in cp_history_files should really say 'exit', because
> otherwise it generates a complaint every time it's run. It should also
> have 'use strict' and 'use warnings' at the top.


Ok

> Here's a version of your test case patch with the 1-line code fix
> added, the above issues addressed, and a bunch of cosmetic tweaks.
> Unfortunately, it doesn't pass for me consistently. I'm not sure if
> that's because I broke something with my changes, or because the test
> contains an underlying race condition which we need to address.
> Attached also are the log files from a failed run if you want to look
> at them. The key lines seem to be:

I could not reproduce this, but I think I found the issue: I used the
wrong target LSN in wait_for_catchup. Instead of waiting for the last
"insert LSN" of the standby, I was waiting for its last "replay LSN",
which was wrong.  Changed as below in the attached patch.

diff --git a/src/test/recovery/t/025_stuck_on_old_timeline.pl b/src/test/recovery/t/025_stuck_on_old_timeline.pl
index 09eb3eb..ee7d78d 100644
--- a/src/test/recovery/t/025_stuck_on_old_timeline.pl
+++ b/src/test/recovery/t/025_stuck_on_old_timeline.pl
@@ -78,7 +78,7 @@ $node_standby->safe_psql('postgres', "CREATE TABLE tab_int AS SELECT 1 AS a");

 # Wait for the replication to catch up
 $node_standby->wait_for_catchup($node_cascade, 'replay',
-       $node_standby->lsn('replay'));
+       $node_standby->lsn('insert'));

 # Check that cascading standby has the new content
 my $result =
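
To illustrate why the target LSN matters, here is a minimal standalone
sketch, not part of the patch. It assumes that lsn('replay') reports
pg_last_wal_replay_lsn(), which stops advancing once recovery has ended
on the promoted standby, and that lsn('insert') reports the current
insert position, which moves forward with the CREATE TABLE. The helper
lsn_to_int and all LSN values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a textual LSN such as '0/3000060' into a
# single integer so that two positions can be compared numerically.
sub lsn_to_int
{
	my ($lsn) = @_;
	my ($hi, $lo) = $lsn =~ m{^([0-9A-Fa-f]+)/([0-9A-Fa-f]+)$}
	  or die "invalid LSN: $lsn";
	return (hex($hi) << 32) | hex($lo);
}

# Made-up positions on the promoted standby.
my $replay_lsn = '0/3000060';    # assumed frozen when recovery ended
my $insert_lsn = '0/40001A8';    # assumed to include the CREATE TABLE

# Made-up replay position of the cascading standby at the time of the wait.
my $cascade_replay = '0/3000060';

# Compare the cascade's position against each candidate target.
for my $target ($replay_lsn, $insert_lsn)
{
	my $caught_up = lsn_to_int($cascade_replay) >= lsn_to_int($target);
	printf "target %s: %s\n", $target,
	  $caught_up ? 'wait returns immediately' : 'wait keeps polling';
}
```

With the stale replay position as the target, the wait can be satisfied
before the new table has reached the cascading standby, so the following
count(*) query may fail intermittently. With the insert position as the
target, the wait only ends once the new WAL has been replayed.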

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

From 7e7b5cad099b1e1d554581276481ef729fc87fbc Mon Sep 17 00:00:00 2001
From: Robert Haas <[email protected]>
Date: Thu, 3 Jun 2021 16:06:57 -0400
Subject: [PATCH v5] Fix corner case failure of new standby to follow new
 primary.

This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from
the archive but the WAL files for it are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming,
and (6) recovery_target_timeline='latest'.

Commit ee994272ca50f70b53074f0febaec97e28f83c4e introduced this
logic and was an improvement over the previous code, but it mishandled
this case. If recovery_target_timeline='latest' and restore_command is
set, validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is then tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.

Dilip Kumar and Robert Haas, with input from Kyotaro Horiguchi.

Discussion: http://postgr.es/m/CAFiTN-sE-jr=lb8jquxeqikd-ux+jhixyh4ydizmpedgqku...@mail.gmail.com
---
 src/test/recovery/t/025_stuck_on_old_timeline.pl | 92 ++++++++++++++++++++++++
 src/test/recovery/t/cp_history_files             | 10 +++
 2 files changed, 102 insertions(+)
 create mode 100644 src/test/recovery/t/025_stuck_on_old_timeline.pl
 create mode 100755 src/test/recovery/t/cp_history_files

diff --git a/src/test/recovery/t/025_stuck_on_old_timeline.pl b/src/test/recovery/t/025_stuck_on_old_timeline.pl
new file mode 100644
index 0000000..ee7d78d
--- /dev/null
+++ b/src/test/recovery/t/025_stuck_on_old_timeline.pl
@@ -0,0 +1,92 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Testing streaming replication where standby is promoted and a new cascading
+# standby (without WAL) is connected to the promoted standby.  Both archiving
+# and streaming are enabled, but only the history file is available from the
+# archive, so the WAL files all have to be streamed.  Test that the cascading
+# standby can follow the new primary (promoted standby).
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use FindBin;
+use Test::More tests => 1;
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+
+# Set up an archive command that will copy the history file but not the WAL
+# files. No real archive command should behave this way; the point is to
+# simulate a race condition where the new cascading standby starts up after
+# the timeline history file reaches the archive but before any of the WAL files
+# get there.
+$node_primary->init(allows_streaming => 1, has_archiving => 1);
+my $archivedir_primary = $node_primary->archive_dir;
+$node_primary->append_conf(
+	'postgresql.conf', qq(
+archive_command = '"$FindBin::RealBin/cp_history_files" "%p" "$archivedir_primary/%f"'
+));
+$node_primary->start;
+
+# Take backup from primary
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby = get_new_node('standby');
+$node_standby->init_from_backup($node_primary, $backup_name,
+	allows_streaming => 1, has_streaming => 1, has_archiving => 1);
+$node_standby->start;
+
+# Take backup of standby, use -Xnone so that pg_wal is empty.
+$node_standby->backup($backup_name, backup_options => ['-Xnone']);
+
+# Create cascading standby but don't start it yet.
+# Must set up both streaming and archiving.
+my $node_cascade = get_new_node('cascade');
+$node_cascade->init_from_backup($node_standby, $backup_name,
+	has_streaming => 1);
+$node_cascade->enable_restoring($node_primary);
+
+# Promote the standby.
+$node_standby->psql('postgres', 'SELECT pg_promote()');
+
+# Find next WAL segment to be archived
+my $walfile_to_be_archived = $node_standby->safe_psql('postgres',
+	"SELECT pg_walfile_name(pg_current_wal_lsn());");
+
+# Make WAL segment eligible for archival
+$node_standby->safe_psql('postgres', 'SELECT pg_switch_wal()');
+
+# Wait until the WAL segment has been archived.
+# Since the history file gets created on promotion and is archived before any
+# WAL segment, this is enough to guarantee that the history file was
+# archived.
+my $archive_wait_query =
+  "SELECT '$walfile_to_be_archived' <= last_archived_wal FROM pg_stat_archiver;";
+$node_standby->poll_query_until('postgres', $archive_wait_query)
+  or die "Timed out while waiting for WAL segment to be archived";
+my $last_archived_wal_file = $walfile_to_be_archived;
+
+# Start cascade node
+$node_cascade->start;
+
+# Create some content on promoted standby and check its presence on the
+# cascading standby.
+$node_standby->safe_psql('postgres', "CREATE TABLE tab_int AS SELECT 1 AS a");
+
+# Wait for the replication to catch up
+$node_standby->wait_for_catchup($node_cascade, 'replay',
+	$node_standby->lsn('insert'));
+
+# Check that cascading standby has the new content
+my $result =
+  $node_cascade->safe_psql('postgres', "SELECT count(*) FROM tab_int");
+print "cascade: $result\n";
+is($result, 1, 'check streamed content on cascade standby');
+
+# clean up
+$node_primary->teardown_node;
+$node_standby->teardown_node;
+$node_cascade->teardown_node;
diff --git a/src/test/recovery/t/cp_history_files b/src/test/recovery/t/cp_history_files
new file mode 100755
index 0000000..cfeea41
--- /dev/null
+++ b/src/test/recovery/t/cp_history_files
@@ -0,0 +1,10 @@
+#!/usr/bin/perl
+
+use File::Copy;
+use strict;
+use warnings;
+
+die "wrong number of arguments" if @ARGV != 2;
+my ($source, $target) = @ARGV;
+exit if $source !~ /history/;
+copy($source, $target) or die "couldn't copy $source to $target: $!";
-- 
1.8.3.1
