On Mon, Oct 30, 2023 at 12:47:41PM -0700, Andres Freund wrote:
> I think the problem with these variables is that they're a really messy state
> machine - something this patch doesn't meaningfully improve IMO.

Okay.  Yes, this is my root issue as well.  We're at the stage where
we should reduce the possible set of combinations and assumptions
we're inventing because people can do undocumented stuff, then perhaps
refactor the code on top of that (say, if one combination with too
booleans is not possible, switch to a three-state enum rather than 2
bools, etc).

>> This configuration was possible when recovering from a base backup taken
>> by pg_basebackup without -R.  Note that the documentation requires at
>> least to set recovery.signal to restore from a backup, but the startup
>> process was not making this policy explicit.
> 
> Maybe I just didn't check the right place, but from I saw, this, at most, is
> implied, rather than explicitly stated.

See the doc reference here:
https://www.postgresql.org/message-id/zubm6bnqneh7l...@paquier.xyz

So it kind of implies it, still also mentions restore_command.  It's
like Schrödinger's cat, yes and no at the same time.

> With -X ... we have all the necessary WAL locally, how does the workload on
> the primary matter? If you pass --no-slot, pg_basebackup might fail to fetch
> the necessary wal, but then you'd also have gotten an error.
>
> [...]
> 
> Right now running pg_basebackup with -X stream, without --write-recovery-conf,
> gives you a copy of a cluster that will come up correctly as a distinct
> instance.
>
> [...]
> 
> I also just don't think that it's always desirable to create a new timeline.

Yeah.  Another argument I was mentioning to Robert is that we may want
to just treat the case where you have a backup_label without any
signal files just the same as crash recovery, replaying all the local
pg_wal/, and nothing else.  For example, something like the attached
should make sure that InArchiveRecovery=true should never be set if
ArchiveRecoveryRequested is not set.

The attached would still cause redo to complain on a "WAL ends before
end of online backup" if not all the WAL is here (reason behind the
tweak of 010_pg_basebackup.pl, but the previous tweak to pg_rewind's
008_min_recovery_point.pl is not required here).

Attached is the idea I had in mind, in terms of code, FWIW.
--
Michael
From fbd9fa407c04799ad4401d3ba5c6b67a0d922631 Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Tue, 31 Oct 2023 10:14:01 +0900
Subject: [PATCH] Force crash recovery with backup_label and no .signal files

---
 src/backend/access/transam/xlogrecovery.c    | 29 ++++++++++++++++----
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  3 +-
 doc/src/sgml/backup.sgml                     |  4 ++-
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index c61566666a..de5787d7e8 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -133,6 +133,8 @@ static TimeLineID curFileTLI;
  * currently performing crash recovery using only XLOG files in pg_wal, but
  * will switch to using offline XLOG archives as soon as we reach the end of
  * WAL in pg_wal.
+ *
+ * InArchiveRecovery should never be set without ArchiveRecoveryRequested.
  */
 bool		ArchiveRecoveryRequested = false;
 bool		InArchiveRecovery = false;
@@ -595,13 +597,22 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		List	   *tablespaces = NIL;
 
 		/*
-		 * Archive recovery was requested, and thanks to the backup label
-		 * file, we know how far we need to replay to reach consistency. Enter
-		 * archive recovery directly.
+		 * If archive recovery was requested, and we know how far we need to
+		 * replay to reach consistency thanks to the backup label file, then
+		 * enter archive recovery directly in this case.
+		 *
+		 * If archive recovery was not requested, then do crash recovery and
+		 * replay all the local WAL.  This still checks that all the WAL up
+		 * to backupEndRequired has been replayed.  This case is useful when
+		 * restoring from a standalone base backup, taken with pg_basebackup
+		 * --wal-method=stream, for example.
 		 */
-		InArchiveRecovery = true;
-		if (StandbyModeRequested)
-			EnableStandbyMode();
+		if (ArchiveRecoveryRequested)
+		{
+			InArchiveRecovery = true;
+			if (StandbyModeRequested)
+				EnableStandbyMode();
+		}
 
 		/*
 		 * When a backup_label file is present, we want to roll forward from
@@ -1591,6 +1602,12 @@ ShutdownWalRecovery(void)
 	 */
 	if (ArchiveRecoveryRequested)
 		DisownLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
+
+	/*
+	 * InArchiveRecovery should never have been set without
+	 * ArchiveRecoveryRequested.
+	 */
+	Assert(ArchiveRecoveryRequested || !InArchiveRecovery);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b9f5e1266b..b9e54f4562 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -392,7 +392,8 @@ SKIP:
 	my $node2 = PostgreSQL::Test::Cluster->new('replica');
 
 	# Recover main data directory
-	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar);
+	$node2->init_from_backup($node, 'tarbackup2', tar_program => $tar,
+	    has_restoring => 1);
 
 	# Recover tablespace into a new directory (not where it was!)
 	my $repTsDir = "$tempdir/tblspc1replica";
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 8cb24d6ae5..5ba7a284cf 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -1380,7 +1380,9 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
       tool. If you include the <literal>-X</literal> parameter when calling
       it, all the write-ahead log required to use the backup will be
       included in the backup automatically, and no special action is
-      required to restore the backup.
+      required to restore the backup. Restoring a standalone backup is
+      equivalent to crash recovery, replaying all the WAL stored in
+      <filename>pg_wal</filename> up to its end.
      </para>
     </sect3>
 
-- 
2.42.0

Attachment: signature.asc
Description: PGP signature

Reply via email to