On Tue, Jun 8, 2021 at 11:13 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> # Wait until the node exits recovery.
> $standby->poll_query_until('postgres', "SELECT pg_is_in_recovery() = 'f';")
> or die "Timed out while waiting for promotion";
>
> I will try to generate a version for 9.6 based on this idea and see how it
> goes
I have adapted it for 9.6, but I am seeing a crash (both with and without the
fix). I could not figure out the reason: no core dump was generated, even
though I changed the pg_ctl invocation in PostgresNode.pm to use "-c" so that
it can generate core files. This is the log from the cascading node
(025_stuck_on_old_timeline_cascade.log):
-------------
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_52dW/archives/000000010000000000000003’: No such file or directory
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
FATAL: could not receive database system identifier and timeline ID from the primary server: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
--------------
The attached logs are from a run without the fix.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
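(For reference, the PostgresNode.pm tweak mentioned above is roughly along
these lines. This is only a sketch, assuming the 9.6-era start() routine
launches the server through TestLib::system_log; the exact invocation in the
tree may differ. pg_ctl's -c/--core-files switch raises the core file size
limit so that a crashing server process is allowed to dump core.)

# In src/test/perl/PostgresNode.pm, sub start: pass -c (--core-files) to
# pg_ctl so that crashing server processes may produce core files.
TestLib::system_log('pg_ctl', '-c', '-w', '-D', $self->data_dir,
	'-l', $self->logfile, 'start');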
From 42886664062e4aae35ee33a6b3a87bc242ec86ad Mon Sep 17 00:00:00 2001
From: Robert Haas <rh...@postgresql.org>
Date: Fri, 4 Jun 2021 10:55:11 -0400
Subject: [PATCH v6] Fix corner case failure of new standby to follow new primary.

This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from the
archive but the WAL files for it are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming, and
(6) recovery_target_timeline='latest'.

Commit ee994272ca50f70b53074f0febaec97e28f83c4e introduced this logic
and was an improvement over the previous code, but it mishandled this
case. If recovery_target_timeline='latest' and restore_command is set,
validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is then tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.

Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.

Discussion: http://postgr.es/m/CAFiTN-sE-jr=lb8jquxeqikd-ux+jhixyh4ydizmpedgqku...@mail.gmail.com
---
 src/backend/access/transam/xlog.c                |  2 +-
 src/test/recovery/t/025_stuck_on_old_timeline.pl | 96 ++++++++++++++++++++++++
 src/test/recovery/t/cp_history_files             | 10 +++
 3 files changed, 107 insertions(+), 1 deletion(-)
 create mode 100644 src/test/recovery/t/025_stuck_on_old_timeline.pl
 create mode 100644 src/test/recovery/t/cp_history_files

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9ad6da9..a7ee14b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -11839,7 +11839,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					if (readFile < 0)
 					{
 						if (!expectedTLEs)
-							expectedTLEs = readTimeLineHistory(receiveTLI);
+							expectedTLEs = readTimeLineHistory(recoveryTargetTLI);
 						readFile = XLogFileRead(readSegNo, PANIC,
 												receiveTLI,
 												XLOG_FROM_STREAM, false);
diff --git a/src/test/recovery/t/025_stuck_on_old_timeline.pl b/src/test/recovery/t/025_stuck_on_old_timeline.pl
new file mode 100644
index 0000000..1eafbbd
--- /dev/null
+++ b/src/test/recovery/t/025_stuck_on_old_timeline.pl
@@ -0,0 +1,96 @@
+
+# Copyright (c) 2021, PostgreSQL Global Development Group
+
+# Testing streaming replication where standby is promoted and a new cascading
+# standby (without WAL) is connected to the promoted standby. Both archiving
+# and streaming are enabled, but only the history file is available from the
+# archive, so the WAL files all have to be streamed. Test that the cascading
+# standby can follow the new primary (promoted standby).
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use FindBin;
+use Test::More tests => 1;
+
+# Initialize primary node
+my $node_primary = get_new_node('primary');
+
+# Set up an archive command that will copy the history file but not the WAL
+# files. No real archive command should behave this way; the point is to
+# simulate a race condition where the new cascading standby starts up after
+# the timeline history file reaches the archive but before any of the WAL
+# files get there.
+$node_primary->init(allows_streaming => 1, has_archiving => 1);
+my $archivedir_primary = $node_primary->archive_dir;
+$node_primary->append_conf(
+	'postgresql.conf', qq(
+archive_command = '$^X "$FindBin::RealBin/cp_history_files" "%p" "$archivedir_primary/%f"'
+));
+$node_primary->start;
+
+# Take backup from primary
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+
+# Create streaming standby linking to primary
+my $node_standby = get_new_node('standby');
+$node_standby->init_from_backup($node_primary, $backup_name,
+	allows_streaming => 1, has_streaming => 1, has_archiving => 1);
+$node_standby->start;
+
+# Take backup of standby, use -Xnone so that pg_wal is empty.
+$node_standby->backup($backup_name, backup_options => ['-Xnone']);
+
+# Create cascading standby but don't start it yet.
+# Must set up both streaming and archiving.
+my $node_cascade = get_new_node('cascade');
+$node_cascade->init_from_backup($node_standby, $backup_name,
+	has_streaming => 1);
+$node_cascade->enable_restoring($node_primary);
+
+# Promote the standby.
+$node_standby->promote;
+
+# Wait until the node exits recovery.
+$node_standby->poll_query_until('postgres', "SELECT pg_is_in_recovery() = 'f';")
+or die "Timed out while waiting for promotion";
+
+# Find next WAL segment to be archived
+my $walfile_to_be_archived = $node_standby->safe_psql('postgres',
+	"SELECT pg_xlogfile_name(pg_current_xlog_location());");
+
+# Make WAL segment eligible for archival
+$node_standby->safe_psql('postgres', 'SELECT pg_switch_xlog()');
+
+# Wait until the WAL segment has been archived.
+# Since the history file gets created on promotion and is archived before any
+# WAL segment, this is enough to guarantee that the history file was
+# archived.
+my $archive_wait_query =
+  "SELECT '$walfile_to_be_archived' <= last_archived_wal FROM pg_stat_archiver;";
+$node_standby->poll_query_until('postgres', $archive_wait_query)
+  or die "Timed out while waiting for WAL segment to be archived";
+my $last_archived_wal_file = $walfile_to_be_archived;
+
+# Start cascade node
+$node_cascade->start;
+
+# Create some content on promoted standby and check its presence on the
+# cascading standby.
+$node_standby->safe_psql('postgres', "CREATE TABLE tab_int AS SELECT 1 AS a");
+
+# Wait for the replication to catch up
+$node_standby->wait_for_catchup($node_cascade, 'replay',
+	$node_standby->lsn('insert'));
+
+# Check that cascading standby has the new content
+my $result =
+  $node_cascade->safe_psql('postgres', "SELECT count(*) FROM tab_int");
+print "cascade: $result\n";
+is($result, 1, 'check streamed content on cascade standby');
+
+# clean up
+$node_primary->teardown_node;
+$node_standby->teardown_node;
+$node_cascade->teardown_node;
diff --git a/src/test/recovery/t/cp_history_files b/src/test/recovery/t/cp_history_files
new file mode 100644
index 0000000..cfeea41
--- /dev/null
+++ b/src/test/recovery/t/cp_history_files
@@ -0,0 +1,10 @@
+#!/usr/bin/perl
+
+use File::Copy;
+use strict;
+use warnings;
+
+die "wrong number of arguments" if @ARGV != 2;
+my ($source, $target) = @ARGV;
+exit if $source !~ /history/;
+copy($source, $target) or die "couldn't copy $source to $target: $!";
-- 
1.8.3.1
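(The crash leaves little to go on in the TAP output, so a small debugging aid
may help. The following is only a sketch, not part of the patch: it assumes
the test's $result and $node_cascade variables and TestLib's slurp_file
helper from the 9.6-era test harness, appended at the end of
025_stuck_on_old_timeline.pl.)

# Debugging only: if the streamed-content check fails, emit the cascading
# standby's server log as TAP diagnostics so the failure can be inspected
# without digging through tmp_check.
diag("cascade log:\n" . TestLib::slurp_file($node_cascade->logfile))
  if $result ne '1';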
LOG: database system was interrupted while in recovery at log time 2021-06-08 14:13:52 IST
HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG: entering standby mode
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000002’: No such file or directory
LOG: redo starts at 0/2000028
LOG: consistent recovery state reached at 0/3000000
LOG: database system is ready to accept read only connections
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
LOG: fetching timeline history file for timeline 2 from primary server
LOG: primary server contains no more WAL on requested timeline 1
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
LOG: primary server contains no more WAL on requested timeline 1
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
LOG: primary server contains no more WAL on requested timeline 1
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
LOG: primary server contains no more WAL on requested timeline 1
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
LOG: primary server contains no more WAL on requested timeline 1
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
LOG: primary server contains no more WAL on requested timeline 1
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
FATAL: could not receive database system identifier and timeline ID from the primary server: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_4Evv/archives/000000010000000000000003’: No such file or directory
FATAL: could not connect to the primary server: could not connect to server: No such file or directory
	Is the server running locally and accepting
	connections on Unix domain socket "/tmp/H1JFqD4gm3/.s.PGSQL.62229"?
LOG: received immediate shutdown request
LOG: database system is shut down
LOG: database system was shut down at 2021-06-08 14:13:51 IST
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
LOG: received immediate shutdown request
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: archiver process (PID 66818) exited with exit code 2
LOG: database system is shut down
LOG: database system was interrupted; last known up at 2021-06-08 14:13:52 IST
LOG: entering standby mode
LOG: redo starts at 0/2000028
LOG: consistent recovery state reached at 0/20000F8
LOG: database system is ready to accept read only connections
LOG: started streaming WAL from primary at 0/3000000 on timeline 1
LOG: received promote request
FATAL: terminating walreceiver process due to administrator command
LOG: redo done at 0/20000F8
LOG: selected new timeline ID: 2
LOG: statement: SELECT pg_is_in_recovery() = 'f';
LOG: archive recovery complete
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
LOG: statement: SELECT pg_is_in_recovery() = 'f';
LOG: statement: SELECT pg_xlogfile_name(pg_current_xlog_location());
LOG: statement: SELECT pg_switch_xlog()
LOG: statement: SELECT '000000020000000000000003' <= last_archived_wal FROM pg_stat_archiver;
LOG: statement: CREATE TABLE tab_int AS SELECT 1 AS a
LOG: received immediate shutdown request
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: archiver process (PID 66845) exited with exit code 2
LOG: database system is shut down
Attachment: regress_log_025_stuck_on_old_timeline