On Thu, Mar 06, 2025 at 01:44:30PM -0600, Nathan Bossart wrote:
> On Thu, Mar 06, 2025 at 11:30:13AM -0800, Noah Misch wrote:
>> 1. Make v14 and v13 skip WAL recycling and preallocation during archive
>>    recovery, like newer branches do.  I think that means back-patching the 
>> six
>>    commits cc2c7d6~4 cc2c7d6~3 cc2c7d6~2 cc2c7d6~1 cc2c7d6 e36cbef.
>> 
>> 2. Revert 1f95181b44c843729caaa688f74babe9403b5850 and its v13 counterpart.
>>    Avoid multiple hard links by making durable_rename_excl() first rename
>>    oldfile to a temporary name.  (I haven't thought this through in detail.
>>    It may not suffice.)
>> 
>> I'm leaning toward (1) at least enough to see how messy the back-patch would
>> be, since I don't like risks of designing old-branch-specific solutions when
>> v15/v16/v17 have a proven solution.  What else should we be thinking about
>> before deciding?
> 
> I agree with first attempting a back-patch.  All of this stuff is from
> 2021-2022, so it's had time to bake and is IMHO lower risk.

Sorry for the late reply here (life things and the timing of Noah's
report).

Discarding the option where we would design a new solution specific to
v14 and v13, I don't really see a different third option here.

Choice 2 means that we'd just leave v13 and v14 exposed to
corruptions, and that the message to our user base is that they should
upgrade to v15 to get that fixed, even if we still have roughly two
more 2 years of official community support, as they would still be
exposed to the risk of having multiple links pointing to a single WAL
segment, creating corruption.  As reported upthread, that can be
reached.

Choice 1 would be the logical path, even if it may not be the
practical one in terms of backpatch risks vs benefits.  It would be
much better than recommending people that they have to upgrade to 15
at least to make sure that their version of Postgres is safer to use.

Saying that, I've looked at the whole for a good part of the day.  In
15~, one thing that may make a backpatch messier is related to
recovery and the fact that Robert has improved/refactored the area
quite a bit, tweaking the code so as we rely on slightly different
assumptions in 13/14 compared to 15~.  There is a piece around
ThisTimeLineID, for example, which is something that
InstallXLogFileSegment() uses and also something that's touched by
1f95181b44c8.  The impact of these pieces of refactorings specific to
v15 is surprisingly lower than I thought it would.

Some notes about the six commits you are mentioning, looking at them
one by one on the 13 and 14 stable branches.

First, for v14:
cc2c7d6~4: some bumps with the removal of use_lock with one flag.
Looks OK otherwise.
cc2c7d6~3: Looks OK.  Based on what this addresses, this is low-risk.
cc2c7d6~2: XLogWalRcvWrite() has a conflict as an effect of
XLogWalRcvWrite() that can be given a "tli".  It still seems correct
to me to use "recvFileTLI = ThisTimeLineID".
cc2c7d6~1: 8ad6c5dbbe5a needs to be reverted first.  Commit
8ad6c5dbbe5a has been introduced as a workaround for what we would
backpatch here.  One conflict in walreceiver.c.
cc2c7d6: Does not conflict, which was a bit surprising, TBH.
043_no_contrecord_switch.pl gets unhappy here, on an assertion failure
in WaitForWALToBecomeAvailable() about InstallXLogFileSegmentActive.
e36cbef: Previous failure fixed by this one, with the same symptoms
as reported in the original thread that led to this commit.

Then, for v13:
cc2c7d6~4: One conflict with InstallXLogFileSegment() in xlog.c,
nothing huge.
cc2c7d6~3: Looks OK.
cc2c7d6~2: Looks OK.
cc2c7d6~1: With 8ad6c5dbbe5a reverted, same as the v14 part.
cc2c7d6: Does not conflict.  043_no_contrecord_switch.pl gets unhappy
here, same as above.
e36cbef: Same as above.

Noah, all these improvements are something you have directly worked
on in 15~.  I'm hoping not to have missed something, even if I've
looked at the paths manipulating the WAL segments changed here on v13
and v14.  Could it be possible for you to double-check and also run
more tests if your enviroments help?  Perhaps this needs specific
prerequisites for recovery?  We need more confidence in the solution
here, even if it's proving to work for v15, we may still have gaps
specific to v14 and/or v13.

I am attaching a full set of patches for v14 and v13 that can be used
for these tests.  WDYT?
--
Michael
From 27d2e964c585a2309bd340d856fe9e2402587b0e Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:55 -0700
Subject: [PATCH v14 1/7] Remove XLogFileInit() ability to skip
 ControlFileLock.

Cold paths, initdb and end-of-recovery, used it.  Don't optimize them.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/include/access/xlog.h             |  2 +-
 src/backend/access/transam/xlog.c     | 46 ++++++++-------------------
 src/backend/replication/walreceiver.c |  2 +-
 3 files changed, 16 insertions(+), 34 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index ee3e369b79f3..596965d353d0 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -296,7 +296,7 @@ extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
-extern int	XLogFileInit(XLogSegNo segno, bool *use_existent, bool use_lock);
+extern int	XLogFileInit(XLogSegNo segno, bool *use_existent);
 extern int	XLogFileOpen(XLogSegNo segno);
 
 extern void CheckXLogRemoved(XLogSegNo segno, TimeLineID tli);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0334c2d3bba2..20449820c7cb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -927,8 +927,7 @@ static void AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic);
 static bool XLogCheckpointNeeded(XLogSegNo new_segno);
 static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible);
 static bool InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
-								   bool find_free, XLogSegNo max_segno,
-								   bool use_lock);
+								   bool find_free, XLogSegNo max_segno);
 static int	XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 						 XLogSource source, bool notfoundOk);
 static int	XLogFileReadAnyTLI(XLogSegNo segno, int emode, XLogSource source);
@@ -2520,7 +2519,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 
 			/* create/use new log file */
 			use_existent = true;
-			openLogFile = XLogFileInit(openLogSegNo, &use_existent, true);
+			openLogFile = XLogFileInit(openLogSegNo, &use_existent);
 			ReserveExternalFD();
 		}
 
@@ -3293,10 +3292,6 @@ XLogNeedsFlush(XLogRecPtr record)
  * pre-existing file will be deleted).  On return, true if a pre-existing
  * file was used.
  *
- * use_lock: if true, acquire ControlFileLock while moving file into
- * place.  This should be true except during bootstrap log creation.  The
- * caller must *not* hold the lock at call.
- *
  * Returns FD of opened file.
  *
  * Note: errors here are ERROR not PANIC because we might or might not be
@@ -3305,7 +3300,7 @@ XLogNeedsFlush(XLogRecPtr record)
  * in a critical section.
  */
 int
-XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
+XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 {
 	char		path[MAXPGPATH];
 	char		tmppath[MAXPGPATH];
@@ -3465,8 +3460,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	 */
 	max_segno = logsegno + CheckPointSegments;
 	if (!InstallXLogFileSegment(&installed_segno, tmppath,
-								*use_existent, max_segno,
-								use_lock))
+								*use_existent, max_segno))
 	{
 		/*
 		 * No need for any more future segments, or InstallXLogFileSegment()
@@ -3623,7 +3617,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
 	/*
 	 * Now move the segment into place with its final name.
 	 */
-	if (!InstallXLogFileSegment(&destsegno, tmppath, false, 0, false))
+	if (!InstallXLogFileSegment(&destsegno, tmppath, false, 0))
 		elog(ERROR, "InstallXLogFileSegment should not have failed");
 }
 
@@ -3647,29 +3641,20 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
  * free slot is found between *segno and max_segno. (Ignored when find_free
  * is false.)
  *
- * use_lock: if true, acquire ControlFileLock while moving file into
- * place.  This should be true except during bootstrap log creation.  The
- * caller must *not* hold the lock at call.
- *
  * Returns true if the file was installed successfully.  false indicates that
  * max_segno limit was exceeded, or an error occurred while renaming the
  * file into place.
  */
 static bool
 InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
-					   bool find_free, XLogSegNo max_segno,
-					   bool use_lock)
+					   bool find_free, XLogSegNo max_segno)
 {
 	char		path[MAXPGPATH];
 	struct stat stat_buf;
 
 	XLogFilePath(path, ThisTimeLineID, *segno, wal_segment_size);
 
-	/*
-	 * We want to be sure that only one process does this at a time.
-	 */
-	if (use_lock)
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 
 	if (!find_free)
 	{
@@ -3684,8 +3669,7 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 			if ((*segno) >= max_segno)
 			{
 				/* Failed to find a free slot within specified range */
-				if (use_lock)
-					LWLockRelease(ControlFileLock);
+				LWLockRelease(ControlFileLock);
 				return false;
 			}
 			(*segno)++;
@@ -3696,14 +3680,12 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 	Assert(access(path, F_OK) != 0 && errno == ENOENT);
 	if (durable_rename(tmppath, path, LOG) != 0)
 	{
-		if (use_lock)
-			LWLockRelease(ControlFileLock);
+		LWLockRelease(ControlFileLock);
 		/* durable_rename already emitted log message */
 		return false;
 	}
 
-	if (use_lock)
-		LWLockRelease(ControlFileLock);
+	LWLockRelease(ControlFileLock);
 
 	return true;
 }
@@ -3974,7 +3956,7 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	{
 		_logSegNo++;
 		use_existent = true;
-		lf = XLogFileInit(_logSegNo, &use_existent, true);
+		lf = XLogFileInit(_logSegNo, &use_existent);
 		close(lf);
 		if (!use_existent)
 			CheckpointStats.ckpt_segs_added++;
@@ -4251,7 +4233,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
 		*endlogSegNo <= recycleSegNo &&
 		lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
 		InstallXLogFileSegment(endlogSegNo, path,
-							   true, recycleSegNo, true))
+							   true, recycleSegNo))
 	{
 		ereport(DEBUG2,
 				(errmsg_internal("recycled write-ahead log file \"%s\"",
@@ -5388,7 +5370,7 @@ BootStrapXLOG(void)
 
 	/* Create first XLOG segment file */
 	use_existent = false;
-	openLogFile = XLogFileInit(1, &use_existent, false);
+	openLogFile = XLogFileInit(1, &use_existent);
 
 	/*
 	 * We needn't bother with Reserve/ReleaseExternalFD here, since we'll
@@ -5697,7 +5679,7 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 		bool		use_existent = true;
 		int			fd;
 
-		fd = XLogFileInit(startLogSegNo, &use_existent, true);
+		fd = XLogFileInit(startLogSegNo, &use_existent);
 
 		if (close(fd) != 0)
 		{
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 4831a259c48a..4ad20f12b61b 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -894,7 +894,7 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 
 			/* Create/use new log file */
 			XLByteToSeg(recptr, recvSegNo, wal_segment_size);
-			recvFile = XLogFileInit(recvSegNo, &use_existent, true);
+			recvFile = XLogFileInit(recvSegNo, &use_existent);
 			recvFileTLI = ThisTimeLineID;
 		}
 
-- 
2.47.2

From 019002a9e1975c101b7c84f62d89cf48a136c851 Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:55 -0700
Subject: [PATCH v14 2/7] In XLogFileInit(), fix *use_existent postcondition to
 suit callers.

Infrequently, the mismatch caused log_checkpoints messages and
TRACE_POSTGRESQL_CHECKPOINT_DONE() to witness an "added" count too high
by one.  Since that consequence is so minor, no back-patch.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/backend/access/transam/xlog.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 20449820c7cb..c9176eea68c4 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3289,8 +3289,8 @@ XLogNeedsFlush(XLogRecPtr record)
  * logsegno: identify segment to be created/opened.
  *
  * *use_existent: if true, OK to use a pre-existing file (else, any
- * pre-existing file will be deleted).  On return, true if a pre-existing
- * file was used.
+ * pre-existing file will be deleted).  On return, false iff this call added
+ * some segment on disk.
  *
  * Returns FD of opened file.
  *
@@ -3459,8 +3459,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 	 * CheckPointSegments.
 	 */
 	max_segno = logsegno + CheckPointSegments;
-	if (!InstallXLogFileSegment(&installed_segno, tmppath,
-								*use_existent, max_segno))
+	if (InstallXLogFileSegment(&installed_segno, tmppath,
+							   *use_existent, max_segno))
+		*use_existent = false;
+	else
 	{
 		/*
 		 * No need for any more future segments, or InstallXLogFileSegment()
@@ -3470,9 +3472,6 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 		unlink(tmppath);
 	}
 
-	/* Set flag to tell caller there was no existent file */
-	*use_existent = false;
-
 	/* Now open original target segment (might not be file I just made) */
 	fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
 	if (fd < 0)
-- 
2.47.2

From 8f4517030288192c81e32ddf456028dade87104e Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:56 -0700
Subject: [PATCH v14 3/7] Remove XLogFileInit() ability to unlink a
 pre-existing file.

Only initdb used it.  initdb refuses to operate on a non-empty directory
and generally does not cope with pre-existing files of other kinds.
Hence, use the opportunity to simplify.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/include/access/xlog.h             |  2 +-
 src/backend/access/transam/xlog.c     | 61 +++++++++++----------------
 src/backend/replication/walreceiver.c |  4 +-
 3 files changed, 28 insertions(+), 39 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 596965d353d0..1860107f9a2e 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -296,7 +296,7 @@ extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
-extern int	XLogFileInit(XLogSegNo segno, bool *use_existent);
+extern int	XLogFileInit(XLogSegNo segno, bool *added);
 extern int	XLogFileOpen(XLogSegNo segno);
 
 extern void CheckXLogRemoved(XLogSegNo segno, TimeLineID tli);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c9176eea68c4..d86295470908 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2452,7 +2452,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 	bool		ispartialpage;
 	bool		last_iteration;
 	bool		finishing_seg;
-	bool		use_existent;
+	bool		added;
 	int			curridx;
 	int			npages;
 	int			startidx;
@@ -2518,8 +2518,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 							wal_segment_size);
 
 			/* create/use new log file */
-			use_existent = true;
-			openLogFile = XLogFileInit(openLogSegNo, &use_existent);
+			openLogFile = XLogFileInit(openLogSegNo, &added);
 			ReserveExternalFD();
 		}
 
@@ -3288,9 +3287,7 @@ XLogNeedsFlush(XLogRecPtr record)
  *
  * logsegno: identify segment to be created/opened.
  *
- * *use_existent: if true, OK to use a pre-existing file (else, any
- * pre-existing file will be deleted).  On return, false iff this call added
- * some segment on disk.
+ * *added: on return, true if this call raised the number of extant segments.
  *
  * Returns FD of opened file.
  *
@@ -3300,7 +3297,7 @@ XLogNeedsFlush(XLogRecPtr record)
  * in a critical section.
  */
 int
-XLogFileInit(XLogSegNo logsegno, bool *use_existent)
+XLogFileInit(XLogSegNo logsegno, bool *added)
 {
 	char		path[MAXPGPATH];
 	char		tmppath[MAXPGPATH];
@@ -3315,19 +3312,17 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 	/*
 	 * Try to use existent file (checkpoint maker may have created it already)
 	 */
-	if (*use_existent)
+	*added = false;
+	fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
+	if (fd < 0)
 	{
-		fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
-		if (fd < 0)
-		{
-			if (errno != ENOENT)
-				ereport(ERROR,
-						(errcode_for_file_access(),
-						 errmsg("could not open file \"%s\": %m", path)));
-		}
-		else
-			return fd;
+		if (errno != ENOENT)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not open file \"%s\": %m", path)));
 	}
+	else
+		return fd;
 
 	/*
 	 * Initialize an empty (all zeroes) segment.  NOTE: it is possible that
@@ -3440,12 +3435,9 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 				 errmsg("could not close file \"%s\": %m", tmppath)));
 
 	/*
-	 * Now move the segment into place with its final name.
-	 *
-	 * If caller didn't want to use a pre-existing file, get rid of any
-	 * pre-existing file.  Otherwise, cope with possibility that someone else
-	 * has created the file while we were filling ours: if so, use ours to
-	 * pre-create a future log segment.
+	 * Now move the segment into place with its final name.  Cope with
+	 * possibility that someone else has created the file while we were
+	 * filling ours: if so, use ours to pre-create a future log segment.
 	 */
 	installed_segno = logsegno;
 
@@ -3459,9 +3451,8 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 	 * CheckPointSegments.
 	 */
 	max_segno = logsegno + CheckPointSegments;
-	if (InstallXLogFileSegment(&installed_segno, tmppath,
-							   *use_existent, max_segno))
-		*use_existent = false;
+	if (InstallXLogFileSegment(&installed_segno, tmppath, true, max_segno))
+		*added = true;
 	else
 	{
 		/*
@@ -3946,7 +3937,7 @@ PreallocXlogFiles(XLogRecPtr endptr)
 {
 	XLogSegNo	_logSegNo;
 	int			lf;
-	bool		use_existent;
+	bool		added;
 	uint64		offset;
 
 	XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size);
@@ -3954,10 +3945,9 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	if (offset >= (uint32) (0.75 * wal_segment_size))
 	{
 		_logSegNo++;
-		use_existent = true;
-		lf = XLogFileInit(_logSegNo, &use_existent);
+		lf = XLogFileInit(_logSegNo, &added);
 		close(lf);
-		if (!use_existent)
+		if (added)
 			CheckpointStats.ckpt_segs_added++;
 	}
 }
@@ -5271,7 +5261,7 @@ BootStrapXLOG(void)
 	XLogLongPageHeader longpage;
 	XLogRecord *record;
 	char	   *recptr;
-	bool		use_existent;
+	bool		added;
 	uint64		sysidentifier;
 	struct timeval tv;
 	pg_crc32c	crc;
@@ -5368,8 +5358,7 @@ BootStrapXLOG(void)
 	record->xl_crc = crc;
 
 	/* Create first XLOG segment file */
-	use_existent = false;
-	openLogFile = XLogFileInit(1, &use_existent);
+	openLogFile = XLogFileInit(1, &added);
 
 	/*
 	 * We needn't bother with Reserve/ReleaseExternalFD here, since we'll
@@ -5675,10 +5664,10 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 		 * The switch happened at a segment boundary, so just create the next
 		 * segment on the new timeline.
 		 */
-		bool		use_existent = true;
+		bool		added;
 		int			fd;
 
-		fd = XLogFileInit(startLogSegNo, &use_existent);
+		fd = XLogFileInit(startLogSegNo, &added);
 
 		if (close(fd) != 0)
 		{
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 4ad20f12b61b..b6f07ab81f1d 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -890,11 +890,11 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 
 		if (recvFile < 0)
 		{
-			bool		use_existent = true;
+			bool		added = true;
 
 			/* Create/use new log file */
 			XLByteToSeg(recptr, recvSegNo, wal_segment_size);
-			recvFile = XLogFileInit(recvSegNo, &use_existent);
+			recvFile = XLogFileInit(recvSegNo, &added);
 			recvFileTLI = ThisTimeLineID;
 		}
 
-- 
2.47.2

From 3086175a74b701e1ae1ddf6423cf796b6e2bde4c Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Mon, 10 Mar 2025 13:18:52 +0900
Subject: [PATCH v14 4/7] Revert "Add HINT for restartpoint race with
 KeepFileRestoredFromArchive()."

This reverts commit 8ad6c5dbbe5a234c55c6663020db297251756006.
---
 src/backend/access/transam/xlog.c |  5 +----
 src/backend/storage/file/fd.c     | 10 ++--------
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d86295470908..793451bdf70c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3468,10 +3468,7 @@ XLogFileInit(XLogSegNo logsegno, bool *added)
 	if (fd < 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
-				 errmsg("could not open file \"%s\": %m", path),
-				 (AmCheckpointerProcess() ?
-				  errhint("This is known to fail occasionally during archive recovery, where it is harmless.") :
-				  0)));
+				 errmsg("could not open file \"%s\": %m", path)));
 
 	elog(DEBUG2, "done creating and filling new WAL file");
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 09c4a93e40b0..484a07302a40 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -834,10 +834,7 @@ durable_rename_excl(const char *oldfile, const char *newfile, int elevel)
 		ereport(elevel,
 				(errcode_for_file_access(),
 				 errmsg("could not link file \"%s\" to \"%s\": %m",
-						oldfile, newfile),
-				 (AmCheckpointerProcess() ?
-				  errhint("This is known to fail occasionally during archive recovery, where it is harmless.") :
-				  0)));
+						oldfile, newfile)));
 		return -1;
 	}
 	unlink(oldfile);
@@ -847,10 +844,7 @@ durable_rename_excl(const char *oldfile, const char *newfile, int elevel)
 		ereport(elevel,
 				(errcode_for_file_access(),
 				 errmsg("could not rename file \"%s\" to \"%s\": %m",
-						oldfile, newfile),
-				 (AmCheckpointerProcess() ?
-				  errhint("This is known to fail occasionally during archive recovery, where it is harmless.") :
-				  0)));
+						oldfile, newfile)));
 		return -1;
 	}
 #endif
-- 
2.47.2

From 712671efd7b77fada6dad308dbaaba28cab22064 Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:56 -0700
Subject: [PATCH v14 5/7] Don't ERROR on PreallocXlogFiles() race condition.

Before a restartpoint finishes PreallocXlogFiles(), a startup process
KeepFileRestoredFromArchive() call can unlink the preallocated segment.
If a CHECKPOINT sql command had elicited the restartpoint experiencing
the race condition, that sql command failed.  Moreover, the restartpoint
omitted its log_checkpoints message and some inessential resource
reclamation.  Prevent the ERROR by skipping open() of the segment.
Since these consequences are so minor, no back-patch.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/include/access/xlog.h             |  2 +-
 src/backend/access/transam/xlog.c     | 79 +++++++++++++++++++--------
 src/backend/replication/walreceiver.c |  4 +-
 3 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 1860107f9a2e..a0706c06a791 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -296,7 +296,7 @@ extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
-extern int	XLogFileInit(XLogSegNo segno, bool *added);
+extern int	XLogFileInit(XLogSegNo segno);
 extern int	XLogFileOpen(XLogSegNo segno);
 
 extern void CheckXLogRemoved(XLogSegNo segno, TimeLineID tli);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 793451bdf70c..73a828f075ad 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2452,7 +2452,6 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 	bool		ispartialpage;
 	bool		last_iteration;
 	bool		finishing_seg;
-	bool		added;
 	int			curridx;
 	int			npages;
 	int			startidx;
@@ -2518,7 +2517,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 							wal_segment_size);
 
 			/* create/use new log file */
-			openLogFile = XLogFileInit(openLogSegNo, &added);
+			openLogFile = XLogFileInit(openLogSegNo);
 			ReserveExternalFD();
 		}
 
@@ -3283,23 +3282,21 @@ XLogNeedsFlush(XLogRecPtr record)
 }
 
 /*
- * Create a new XLOG file segment, or open a pre-existing one.
+ * Try to make a given XLOG file segment exist.
  *
- * logsegno: identify segment to be created/opened.
+ * logsegno: identify segment.
  *
  * *added: on return, true if this call raised the number of extant segments.
  *
- * Returns FD of opened file.
+ * path: on return, this char[MAXPGPATH] has the path to the logsegno file.
  *
- * Note: errors here are ERROR not PANIC because we might or might not be
- * inside a critical section (eg, during checkpoint there is no reason to
- * take down the system on failure).  They will promote to PANIC if we are
- * in a critical section.
+ * Returns -1 or FD of opened file.  A -1 here is not an error; a caller
+ * wanting an open segment should attempt to open "path", which usually will
+ * succeed.  (This is weird, but it's efficient for the callers.)
  */
-int
-XLogFileInit(XLogSegNo logsegno, bool *added)
+static int
+XLogFileInitInternal(XLogSegNo logsegno, bool *added, char *path)
 {
-	char		path[MAXPGPATH];
 	char		tmppath[MAXPGPATH];
 	PGAlignedXLogBlock zbuffer;
 	XLogSegNo	installed_segno;
@@ -3452,26 +3449,53 @@ XLogFileInit(XLogSegNo logsegno, bool *added)
 	 */
 	max_segno = logsegno + CheckPointSegments;
 	if (InstallXLogFileSegment(&installed_segno, tmppath, true, max_segno))
+	{
 		*added = true;
+		elog(DEBUG2, "done creating and filling new WAL file");
+	}
 	else
 	{
 		/*
 		 * No need for any more future segments, or InstallXLogFileSegment()
-		 * failed to rename the file into place. If the rename failed, opening
-		 * the file below will fail.
+		 * failed to rename the file into place. If the rename failed, a
+		 * caller opening the file may fail.
 		 */
 		unlink(tmppath);
+		elog(DEBUG2, "abandoned new WAL file");
 	}
 
+	return -1;
+}
+
+/*
+ * Create a new XLOG file segment, or open a pre-existing one.
+ *
+ * logsegno: identify segment to be created/opened.
+ *
+ * Returns FD of opened file.
+ *
+ * Note: errors here are ERROR not PANIC because we might or might not be
+ * inside a critical section (eg, during checkpoint there is no reason to
+ * take down the system on failure).  They will promote to PANIC if we are
+ * in a critical section.
+ */
+int
+XLogFileInit(XLogSegNo logsegno)
+{
+	bool		ignore_added;
+	char		path[MAXPGPATH];
+	int			fd;
+
+	fd = XLogFileInitInternal(logsegno, &ignore_added, path);
+	if (fd >= 0)
+		return fd;
+
 	/* Now open original target segment (might not be file I just made) */
 	fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
 	if (fd < 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not open file \"%s\": %m", path)));
-
-	elog(DEBUG2, "done creating and filling new WAL file");
-
 	return fd;
 }
 
@@ -3928,6 +3952,15 @@ XLogFileClose(void)
  * High-volume systems will be OK once they've built up a sufficient set of
  * recycled log segments, but the startup transient is likely to include
  * a lot of segment creations by foreground processes, which is not so good.
+ *
+ * XLogFileInitInternal() can ereport(ERROR).  All known causes indicate big
+ * trouble; for example, a full filesystem is one cause.  The checkpoint WAL
+ * and/or ControlFile updates already completed.  If a RequestCheckpoint()
+ * initiated the present checkpoint and an ERROR ends this function, the
+ * command that called RequestCheckpoint() fails.  That's not ideal, but it's
+ * not worth contorting more functions to use caller-specified elevel values.
+ * (With or without RequestCheckpoint(), an ERROR forestalls some inessential
+ * reporting and resource reclamation.)
  */
 static void
 PreallocXlogFiles(XLogRecPtr endptr)
@@ -3935,6 +3968,7 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	XLogSegNo	_logSegNo;
 	int			lf;
 	bool		added;
+	char		path[MAXPGPATH];
 	uint64		offset;
 
 	XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size);
@@ -3942,8 +3976,9 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	if (offset >= (uint32) (0.75 * wal_segment_size))
 	{
 		_logSegNo++;
-		lf = XLogFileInit(_logSegNo, &added);
-		close(lf);
+		lf = XLogFileInitInternal(_logSegNo, &added, path);
+		if (lf >= 0)
+			close(lf);
 		if (added)
 			CheckpointStats.ckpt_segs_added++;
 	}
@@ -5258,7 +5293,6 @@ BootStrapXLOG(void)
 	XLogLongPageHeader longpage;
 	XLogRecord *record;
 	char	   *recptr;
-	bool		added;
 	uint64		sysidentifier;
 	struct timeval tv;
 	pg_crc32c	crc;
@@ -5355,7 +5389,7 @@ BootStrapXLOG(void)
 	record->xl_crc = crc;
 
 	/* Create first XLOG segment file */
-	openLogFile = XLogFileInit(1, &added);
+	openLogFile = XLogFileInit(1);
 
 	/*
 	 * We needn't bother with Reserve/ReleaseExternalFD here, since we'll
@@ -5661,10 +5695,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 		 * The switch happened at a segment boundary, so just create the next
 		 * segment on the new timeline.
 		 */
-		bool		added;
 		int			fd;
 
-		fd = XLogFileInit(startLogSegNo, &added);
+		fd = XLogFileInit(startLogSegNo);
 
 		if (close(fd) != 0)
 		{
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index b6f07ab81f1d..7ae04b9682c7 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -890,11 +890,9 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 
 		if (recvFile < 0)
 		{
-			bool		added = true;
-
 			/* Create/use new log file */
 			XLByteToSeg(recptr, recvSegNo, wal_segment_size);
-			recvFile = XLogFileInit(recvSegNo, &added);
+			recvFile = XLogFileInit(recvSegNo);
 			recvFileTLI = ThisTimeLineID;
 		}
 
-- 
2.47.2

From 8bf97022fdd53a019c709efb757a5f1d953a1474 Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:56 -0700
Subject: [PATCH v14 6/7] Skip WAL recycling and preallocation during archive
 recovery.

The previous commit addressed the chief consequences of a race condition
between InstallXLogFileSegment() and KeepFileRestoredFromArchive().  Fix
three lesser consequences.  A spurious durable_rename_excl() LOG message
remained possible.  KeepFileRestoredFromArchive() wasted the proceeds of
WAL recycling and preallocation.  Finally, XLogFileInitInternal() could
return a descriptor for a file that KeepFileRestoredFromArchive() had
already unlinked.  That felt like a recipe for future bugs.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/backend/access/transam/xlog.c | 65 +++++++++++++++++++++++++++----
 1 file changed, 57 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 73a828f075ad..c5fe7e6fa4cc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -673,6 +673,16 @@ typedef struct XLogCtlData
 	 */
 	bool		SharedHotStandbyActive;
 
+	/*
+	 * InstallXLogFileSegmentActive indicates whether the checkpointer should
+	 * arrange for future segments by recycling and/or PreallocXlogFiles().
+	 * Protected by ControlFileLock.  Only the startup process changes it.  If
+	 * true, anyone can use InstallXLogFileSegment().  If false, the startup
+	 * process owns the exclusive right to install segments, by reading from
+	 * the archive and possibly replacing existing files.
+	 */
+	bool		InstallXLogFileSegmentActive;
+
 	/*
 	 * SharedPromoteIsTriggered indicates if a standby promotion has been
 	 * triggered.  Protected by info_lck.
@@ -935,6 +945,7 @@ static int	XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
 						 int reqLen, XLogRecPtr targetRecPtr, char *readBuf);
 static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 										bool fetching_ckpt, XLogRecPtr tliRecPtr);
+static void XLogShutdownWalRcv(void);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -3653,8 +3664,8 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
  * is false.)
  *
  * Returns true if the file was installed successfully.  false indicates that
- * max_segno limit was exceeded, or an error occurred while renaming the
- * file into place.
+ * max_segno limit was exceeded, the startup process has disabled this
+ * function for now, or an error occurred while renaming the file into place.
  */
 static bool
 InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
@@ -3666,6 +3677,11 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 	XLogFilePath(path, ThisTimeLineID, *segno, wal_segment_size);
 
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	if (!XLogCtl->InstallXLogFileSegmentActive)
+	{
+		LWLockRelease(ControlFileLock);
+		return false;
+	}
 
 	if (!find_free)
 	{
@@ -3770,6 +3786,7 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 	 */
 	if (source == XLOG_FROM_ARCHIVE)
 	{
+		Assert(!XLogCtl->InstallXLogFileSegmentActive);
 		KeepFileRestoredFromArchive(path, xlogfname);
 
 		/*
@@ -3971,6 +3988,9 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	char		path[MAXPGPATH];
 	uint64		offset;
 
+	if (!XLogCtl->InstallXLogFileSegmentActive)
+		return;					/* unlocked check says no */
+
 	XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size);
 	offset = XLogSegmentOffset(endptr - 1, wal_segment_size);
 	if (offset >= (uint32) (0.75 * wal_segment_size))
@@ -4252,6 +4272,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
 	 */
 	if (wal_recycle &&
 		*endlogSegNo <= recycleSegNo &&
+		XLogCtl->InstallXLogFileSegmentActive &&	/* callee rechecks this */
 		lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
 		InstallXLogFileSegment(endlogSegNo, path,
 							   true, recycleSegNo))
@@ -4265,7 +4286,7 @@ RemoveXlogFile(const char *segname, XLogSegNo recycleSegNo,
 	}
 	else
 	{
-		/* No need for any more future segments... */
+		/* No need for any more future segments, or recycling failed ... */
 		int			rc;
 
 		ereport(DEBUG2,
@@ -5270,6 +5291,7 @@ XLOGShmemInit(void)
 	XLogCtl->XLogCacheBlck = XLOGbuffers - 1;
 	XLogCtl->SharedRecoveryState = RECOVERY_STATE_CRASH;
 	XLogCtl->SharedHotStandbyActive = false;
+	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->SharedPromoteIsTriggered = false;
 	XLogCtl->WalWriterSleeping = false;
 
@@ -5297,6 +5319,11 @@ BootStrapXLOG(void)
 	struct timeval tv;
 	pg_crc32c	crc;
 
+	/* allow ordinary WAL segment creation, like StartupXLOG() would */
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	XLogCtl->InstallXLogFileSegmentActive = true;
+	LWLockRelease(ControlFileLock);
+
 	/*
 	 * Select a hopefully-unique system identifier code for this installation.
 	 * We use the result of gettimeofday(), including the fractional seconds
@@ -7685,7 +7712,7 @@ StartupXLOG(void)
 	 * over these records and subsequent ones if it's still alive when we
 	 * start writing WAL.
 	 */
-	ShutdownWalRcv();
+	XLogShutdownWalRcv();
 
 	/*
 	 * Reset unlogged relations to the contents of their INIT fork. This is
@@ -7710,7 +7737,7 @@ StartupXLOG(void)
 	 * recovery, e.g., timeline history file) from archive or pg_wal.
 	 *
 	 * Note that standby mode must be turned off after killing WAL receiver,
-	 * i.e., calling ShutdownWalRcv().
+	 * i.e., calling XLogShutdownWalRcv().
 	 */
 	Assert(!WalRcvStreaming());
 	StandbyMode = false;
@@ -7779,6 +7806,14 @@ StartupXLOG(void)
 	 */
 	oldestActiveXID = PrescanPreparedTransactions(NULL, NULL);
 
+	/*
+	 * Allow ordinary WAL segment creation before any exitArchiveRecovery(),
+	 * which sometimes creates a segment, and after the last ReadRecord().
+	 */
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	XLogCtl->InstallXLogFileSegmentActive = true;
+	LWLockRelease(ControlFileLock);
+
 	/*
 	 * Consider whether we need to assign a new timeline ID.
 	 *
@@ -12717,7 +12752,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 */
 					if (StandbyMode && CheckForStandbyTrigger())
 					{
-						ShutdownWalRcv();
+						XLogShutdownWalRcv();
 						return false;
 					}
 
@@ -12765,7 +12800,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 * WAL that we restore from archive.
 					 */
 					if (WalRcvStreaming())
-						ShutdownWalRcv();
+						XLogShutdownWalRcv();
 
 					/*
 					 * Before we sleep, re-scan for possible new timelines if
@@ -12895,7 +12930,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 */
 					if (pendingWalRcvRestart && !startWalReceiver)
 					{
-						ShutdownWalRcv();
+						XLogShutdownWalRcv();
 
 						/*
 						 * Re-scan for possible new timelines if we were
@@ -12945,6 +12980,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 									 tli, curFileTLI);
 						}
 						curFileTLI = tli;
+						LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+						XLogCtl->InstallXLogFileSegmentActive = true;
+						LWLockRelease(ControlFileLock);
 						RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
 											 PrimarySlotName,
 											 wal_receiver_create_temp_slot);
@@ -13115,6 +13153,17 @@ StartupRequestWalReceiverRestart(void)
 	}
 }
 
+/* Thin wrapper around ShutdownWalRcv(). */
+static void
+XLogShutdownWalRcv(void)
+{
+	ShutdownWalRcv();
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	XLogCtl->InstallXLogFileSegmentActive = false;
+	LWLockRelease(ControlFileLock);
+}
+
 /*
  * Determine what log level should be used to report a corrupt WAL record
  * in the current WAL page, previously read by XLogPageRead().
-- 
2.47.2

From c2d49a28a5d1a0e48c7a9edf97c481d984af98e9 Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Thu, 15 Sep 2022 06:45:23 -0700
Subject: [PATCH v14 7/7] Reset InstallXLogFileSegmentActive after walreceiver
 self-initiated exit.

After commit cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 added this flag,
failure to reset it caused assertion failures.  In non-assert builds, it
made the system fail to achieve the objectives listed in that commit;
chiefly, we might emit a spurious log message.  Back-patch to v15, where
that commit first appeared.

Bharath Rupireddy and Kyotaro Horiguchi.  Reviewed by Dilip Kumar,
Nathan Bossart and Michael Paquier.  Reported by Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-sE3ry=ycMPVtC+Djw4Fd7gbUGVv_qqw6qfzp=jlvq...@mail.gmail.com
---
 src/backend/access/transam/xlog.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c5fe7e6fa4cc..9ce4297e3de6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12799,8 +12799,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 * walreceiver is not active, so that it won't overwrite
 					 * WAL that we restore from archive.
 					 */
-					if (WalRcvStreaming())
-						XLogShutdownWalRcv();
+					XLogShutdownWalRcv();
 
 					/*
 					 * Before we sleep, re-scan for possible new timelines if
-- 
2.47.2

From d5de431953286182b8b0bfea4c0ecf23c1d31fbc Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:55 -0700
Subject: [PATCH v13 1/7] Remove XLogFileInit() ability to skip
 ControlFileLock.

Cold paths, initdb and end-of-recovery, used it.  Don't optimize them.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/include/access/xlog.h             |  2 +-
 src/backend/access/transam/xlog.c     | 46 ++++++++-------------------
 src/backend/replication/walreceiver.c |  2 +-
 3 files changed, 16 insertions(+), 34 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 01b3b0a767dc..7cb147a7ee37 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -286,7 +286,7 @@ extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
-extern int	XLogFileInit(XLogSegNo segno, bool *use_existent, bool use_lock);
+extern int	XLogFileInit(XLogSegNo segno, bool *use_existent);
 extern int	XLogFileOpen(XLogSegNo segno);
 
 extern void CheckXLogRemoved(XLogSegNo segno, TimeLineID tli);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 769f4a202f28..859f86cfc3cd 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -917,8 +917,7 @@ static void AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic);
 static bool XLogCheckpointNeeded(XLogSegNo new_segno);
 static void XLogWrite(XLogwrtRqst WriteRqst, bool flexible);
 static bool InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
-								   bool find_free, XLogSegNo max_segno,
-								   bool use_lock);
+								   bool find_free, XLogSegNo max_segno);
 static int	XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 						 XLogSource source, bool notfoundOk);
 static int	XLogFileReadAnyTLI(XLogSegNo segno, int emode, XLogSource source);
@@ -2509,7 +2508,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 
 			/* create/use new log file */
 			use_existent = true;
-			openLogFile = XLogFileInit(openLogSegNo, &use_existent, true);
+			openLogFile = XLogFileInit(openLogSegNo, &use_existent);
 			ReserveExternalFD();
 		}
 
@@ -3263,10 +3262,6 @@ XLogNeedsFlush(XLogRecPtr record)
  * pre-existing file will be deleted).  On return, true if a pre-existing
  * file was used.
  *
- * use_lock: if true, acquire ControlFileLock while moving file into
- * place.  This should be true except during bootstrap log creation.  The
- * caller must *not* hold the lock at call.
- *
  * Returns FD of opened file.
  *
  * Note: errors here are ERROR not PANIC because we might or might not be
@@ -3275,7 +3270,7 @@ XLogNeedsFlush(XLogRecPtr record)
  * in a critical section.
  */
 int
-XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
+XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 {
 	char		path[MAXPGPATH];
 	char		tmppath[MAXPGPATH];
@@ -3420,8 +3415,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
 	 */
 	max_segno = logsegno + CheckPointSegments;
 	if (!InstallXLogFileSegment(&installed_segno, tmppath,
-								*use_existent, max_segno,
-								use_lock))
+								*use_existent, max_segno))
 	{
 		/*
 		 * No need for any more future segments, or InstallXLogFileSegment()
@@ -3578,7 +3572,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
 	/*
 	 * Now move the segment into place with its final name.
 	 */
-	if (!InstallXLogFileSegment(&destsegno, tmppath, false, 0, false))
+	if (!InstallXLogFileSegment(&destsegno, tmppath, false, 0))
 		elog(ERROR, "InstallXLogFileSegment should not have failed");
 }
 
@@ -3602,29 +3596,20 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
  * free slot is found between *segno and max_segno. (Ignored when find_free
  * is false.)
  *
- * use_lock: if true, acquire ControlFileLock while moving file into
- * place.  This should be true except during bootstrap log creation.  The
- * caller must *not* hold the lock at call.
- *
  * Returns true if the file was installed successfully.  false indicates that
  * max_segno limit was exceeded, or an error occurred while renaming the
  * file into place.
  */
 static bool
 InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
-					   bool find_free, XLogSegNo max_segno,
-					   bool use_lock)
+					   bool find_free, XLogSegNo max_segno)
 {
 	char		path[MAXPGPATH];
 	struct stat stat_buf;
 
 	XLogFilePath(path, ThisTimeLineID, *segno, wal_segment_size);
 
-	/*
-	 * We want to be sure that only one process does this at a time.
-	 */
-	if (use_lock)
-		LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 
 	if (!find_free)
 	{
@@ -3639,8 +3624,7 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 			if ((*segno) >= max_segno)
 			{
 				/* Failed to find a free slot within specified range */
-				if (use_lock)
-					LWLockRelease(ControlFileLock);
+				LWLockRelease(ControlFileLock);
 				return false;
 			}
 			(*segno)++;
@@ -3651,14 +3635,12 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 	Assert(access(path, F_OK) != 0 && errno == ENOENT);
 	if (durable_rename(tmppath, path, LOG) != 0)
 	{
-		if (use_lock)
-			LWLockRelease(ControlFileLock);
+		LWLockRelease(ControlFileLock);
 		/* durable_rename already emitted log message */
 		return false;
 	}
 
-	if (use_lock)
-		LWLockRelease(ControlFileLock);
+	LWLockRelease(ControlFileLock);
 
 	return true;
 }
@@ -3929,7 +3911,7 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	{
 		_logSegNo++;
 		use_existent = true;
-		lf = XLogFileInit(_logSegNo, &use_existent, true);
+		lf = XLogFileInit(_logSegNo, &use_existent);
 		close(lf);
 		if (!use_existent)
 			CheckpointStats.ckpt_segs_added++;
@@ -4206,7 +4188,7 @@ RemoveXlogFile(const char *segname, XLogRecPtr lastredoptr, XLogRecPtr endptr)
 		endlogSegNo <= recycleSegNo &&
 		lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
 		InstallXLogFileSegment(&endlogSegNo, path,
-							   true, recycleSegNo, true))
+							   true, recycleSegNo))
 	{
 		ereport(DEBUG2,
 				(errmsg("recycled write-ahead log file \"%s\"",
@@ -5342,7 +5324,7 @@ BootStrapXLOG(void)
 
 	/* Create first XLOG segment file */
 	use_existent = false;
-	openLogFile = XLogFileInit(1, &use_existent, false);
+	openLogFile = XLogFileInit(1, &use_existent);
 
 	/*
 	 * We needn't bother with Reserve/ReleaseExternalFD here, since we'll
@@ -5651,7 +5633,7 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 		bool		use_existent = true;
 		int			fd;
 
-		fd = XLogFileInit(startLogSegNo, &use_existent, true);
+		fd = XLogFileInit(startLogSegNo, &use_existent);
 
 		if (close(fd) != 0)
 		{
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 479c6c441bda..b874cf364f06 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -922,7 +922,7 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 
 			/* Create/use new log file */
 			XLByteToSeg(recptr, recvSegNo, wal_segment_size);
-			recvFile = XLogFileInit(recvSegNo, &use_existent, true);
+			recvFile = XLogFileInit(recvSegNo, &use_existent);
 			recvFileTLI = ThisTimeLineID;
 		}
 
-- 
2.47.2

From e805d4e8b2bcfb7d72552ce914e480606c978bef Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:55 -0700
Subject: [PATCH v13 2/7] In XLogFileInit(), fix *use_existent postcondition to
 suit callers.

Infrequently, the mismatch caused log_checkpoints messages and
TRACE_POSTGRESQL_CHECKPOINT_DONE() to witness an "added" count too high
by one.  Since that consequence is so minor, no back-patch.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/backend/access/transam/xlog.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 859f86cfc3cd..28fd6da7d366 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3259,8 +3259,8 @@ XLogNeedsFlush(XLogRecPtr record)
  * logsegno: identify segment to be created/opened.
  *
  * *use_existent: if true, OK to use a pre-existing file (else, any
- * pre-existing file will be deleted).  On return, true if a pre-existing
- * file was used.
+ * pre-existing file will be deleted).  On return, false iff this call added
+ * some segment on disk.
  *
  * Returns FD of opened file.
  *
@@ -3414,8 +3414,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 	 * CheckPointSegments.
 	 */
 	max_segno = logsegno + CheckPointSegments;
-	if (!InstallXLogFileSegment(&installed_segno, tmppath,
-								*use_existent, max_segno))
+	if (InstallXLogFileSegment(&installed_segno, tmppath,
+							   *use_existent, max_segno))
+		*use_existent = false;
+	else
 	{
 		/*
 		 * No need for any more future segments, or InstallXLogFileSegment()
@@ -3425,9 +3427,6 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 		unlink(tmppath);
 	}
 
-	/* Set flag to tell caller there was no existent file */
-	*use_existent = false;
-
 	/* Now open original target segment (might not be file I just made) */
 	fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
 	if (fd < 0)
-- 
2.47.2

From 3944585703be84b60e7ff1c45c2578ee312aa736 Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:56 -0700
Subject: [PATCH v13 3/7] Remove XLogFileInit() ability to unlink a
 pre-existing file.

Only initdb used it.  initdb refuses to operate on a non-empty directory
and generally does not cope with pre-existing files of other kinds.
Hence, use the opportunity to simplify.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/include/access/xlog.h             |  2 +-
 src/backend/access/transam/xlog.c     | 61 +++++++++++----------------
 src/backend/replication/walreceiver.c |  4 +-
 3 files changed, 28 insertions(+), 39 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 7cb147a7ee37..6119452ba8cd 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -286,7 +286,7 @@ extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
-extern int	XLogFileInit(XLogSegNo segno, bool *use_existent);
+extern int	XLogFileInit(XLogSegNo segno, bool *added);
 extern int	XLogFileOpen(XLogSegNo segno);
 
 extern void CheckXLogRemoved(XLogSegNo segno, TimeLineID tli);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 28fd6da7d366..6fbb5328fd0b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2440,7 +2440,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 	bool		ispartialpage;
 	bool		last_iteration;
 	bool		finishing_seg;
-	bool		use_existent;
+	bool		added;
 	int			curridx;
 	int			npages;
 	int			startidx;
@@ -2507,8 +2507,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 							wal_segment_size);
 
 			/* create/use new log file */
-			use_existent = true;
-			openLogFile = XLogFileInit(openLogSegNo, &use_existent);
+			openLogFile = XLogFileInit(openLogSegNo, &added);
 			ReserveExternalFD();
 		}
 
@@ -3258,9 +3257,7 @@ XLogNeedsFlush(XLogRecPtr record)
  *
  * logsegno: identify segment to be created/opened.
  *
- * *use_existent: if true, OK to use a pre-existing file (else, any
- * pre-existing file will be deleted).  On return, false iff this call added
- * some segment on disk.
+ * *added: on return, true if this call raised the number of extant segments.
  *
  * Returns FD of opened file.
  *
@@ -3270,7 +3267,7 @@ XLogNeedsFlush(XLogRecPtr record)
  * in a critical section.
  */
 int
-XLogFileInit(XLogSegNo logsegno, bool *use_existent)
+XLogFileInit(XLogSegNo logsegno, bool *added)
 {
 	char		path[MAXPGPATH];
 	char		tmppath[MAXPGPATH];
@@ -3286,19 +3283,17 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 	/*
 	 * Try to use existent file (checkpoint maker may have created it already)
 	 */
-	if (*use_existent)
+	*added = false;
+	fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
+	if (fd < 0)
 	{
-		fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
-		if (fd < 0)
-		{
-			if (errno != ENOENT)
-				ereport(ERROR,
-						(errcode_for_file_access(),
-						 errmsg("could not open file \"%s\": %m", path)));
-		}
-		else
-			return fd;
+		if (errno != ENOENT)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not open file \"%s\": %m", path)));
 	}
+	else
+		return fd;
 
 	/*
 	 * Initialize an empty (all zeroes) segment.  NOTE: it is possible that
@@ -3395,12 +3390,9 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 				 errmsg("could not close file \"%s\": %m", tmppath)));
 
 	/*
-	 * Now move the segment into place with its final name.
-	 *
-	 * If caller didn't want to use a pre-existing file, get rid of any
-	 * pre-existing file.  Otherwise, cope with possibility that someone else
-	 * has created the file while we were filling ours: if so, use ours to
-	 * pre-create a future log segment.
+	 * Now move the segment into place with its final name.  Cope with
+	 * possibility that someone else has created the file while we were
+	 * filling ours: if so, use ours to pre-create a future log segment.
 	 */
 	installed_segno = logsegno;
 
@@ -3414,9 +3406,8 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent)
 	 * CheckPointSegments.
 	 */
 	max_segno = logsegno + CheckPointSegments;
-	if (InstallXLogFileSegment(&installed_segno, tmppath,
-							   *use_existent, max_segno))
-		*use_existent = false;
+	if (InstallXLogFileSegment(&installed_segno, tmppath, true, max_segno))
+		*added = true;
 	else
 	{
 		/*
@@ -3901,7 +3892,7 @@ PreallocXlogFiles(XLogRecPtr endptr)
 {
 	XLogSegNo	_logSegNo;
 	int			lf;
-	bool		use_existent;
+	bool		added;
 	uint64		offset;
 
 	XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size);
@@ -3909,10 +3900,9 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	if (offset >= (uint32) (0.75 * wal_segment_size))
 	{
 		_logSegNo++;
-		use_existent = true;
-		lf = XLogFileInit(_logSegNo, &use_existent);
+		lf = XLogFileInit(_logSegNo, &added);
 		close(lf);
-		if (!use_existent)
+		if (added)
 			CheckpointStats.ckpt_segs_added++;
 	}
 }
@@ -5225,7 +5215,7 @@ BootStrapXLOG(void)
 	XLogLongPageHeader longpage;
 	XLogRecord *record;
 	char	   *recptr;
-	bool		use_existent;
+	bool		added;
 	uint64		sysidentifier;
 	struct timeval tv;
 	pg_crc32c	crc;
@@ -5322,8 +5312,7 @@ BootStrapXLOG(void)
 	record->xl_crc = crc;
 
 	/* Create first XLOG segment file */
-	use_existent = false;
-	openLogFile = XLogFileInit(1, &use_existent);
+	openLogFile = XLogFileInit(1, &added);
 
 	/*
 	 * We needn't bother with Reserve/ReleaseExternalFD here, since we'll
@@ -5629,10 +5618,10 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 		 * The switch happened at a segment boundary, so just create the next
 		 * segment on the new timeline.
 		 */
-		bool		use_existent = true;
+		bool		added;
 		int			fd;
 
-		fd = XLogFileInit(startLogSegNo, &use_existent);
+		fd = XLogFileInit(startLogSegNo, &added);
 
 		if (close(fd) != 0)
 		{
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index b874cf364f06..976dbc04cde7 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -918,11 +918,11 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 
 		if (recvFile < 0)
 		{
-			bool		use_existent = true;
+			bool		added = true;
 
 			/* Create/use new log file */
 			XLByteToSeg(recptr, recvSegNo, wal_segment_size);
-			recvFile = XLogFileInit(recvSegNo, &use_existent);
+			recvFile = XLogFileInit(recvSegNo, &added);
 			recvFileTLI = ThisTimeLineID;
 		}
 
-- 
2.47.2

From 994fd5e49f76e40b9eae6d02d0a463d5048e9302 Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Mon, 10 Mar 2025 13:18:52 +0900
Subject: [PATCH v13 4/7] Revert "Add HINT for restartpoint race with
 KeepFileRestoredFromArchive()."

This reverts commit 8ad6c5dbbe5a234c55c6663020db297251756006.
---
 src/backend/access/transam/xlog.c |  5 +----
 src/backend/storage/file/fd.c     | 10 ++--------
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6fbb5328fd0b..4ef417f1eddb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3423,10 +3423,7 @@ XLogFileInit(XLogSegNo logsegno, bool *added)
 	if (fd < 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
-				 errmsg("could not open file \"%s\": %m", path),
-				 (AmCheckpointerProcess() ?
-				  errhint("This is known to fail occasionally during archive recovery, where it is harmless.") :
-				  0)));
+				 errmsg("could not open file \"%s\": %m", path)));
 
 	elog(DEBUG2, "done creating and filling new WAL file");
 
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 452ab88cafad..7aa50a5d2e57 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -800,10 +800,7 @@ durable_rename_excl(const char *oldfile, const char *newfile, int elevel)
 		ereport(elevel,
 				(errcode_for_file_access(),
 				 errmsg("could not link file \"%s\" to \"%s\": %m",
-						oldfile, newfile),
-				 (AmCheckpointerProcess() ?
-				  errhint("This is known to fail occasionally during archive recovery, where it is harmless.") :
-				  0)));
+						oldfile, newfile)));
 		return -1;
 	}
 	unlink(oldfile);
@@ -813,10 +810,7 @@ durable_rename_excl(const char *oldfile, const char *newfile, int elevel)
 		ereport(elevel,
 				(errcode_for_file_access(),
 				 errmsg("could not rename file \"%s\" to \"%s\": %m",
-						oldfile, newfile),
-				 (AmCheckpointerProcess() ?
-				  errhint("This is known to fail occasionally during archive recovery, where it is harmless.") :
-				  0)));
+						oldfile, newfile)));
 		return -1;
 	}
 #endif
-- 
2.47.2

From 1ab26db206a00ce2a4b24996b7b13ae841662dfc Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:56 -0700
Subject: [PATCH v13 5/7] Don't ERROR on PreallocXlogFiles() race condition.

Before a restartpoint finishes PreallocXlogFiles(), a startup process
KeepFileRestoredFromArchive() call can unlink the preallocated segment.
If a CHECKPOINT sql command had elicited the restartpoint experiencing
the race condition, that sql command failed.  Moreover, the restartpoint
omitted its log_checkpoints message and some inessential resource
reclamation.  Prevent the ERROR by skipping open() of the segment.
Since these consequences are so minor, no back-patch.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/include/access/xlog.h             |  2 +-
 src/backend/access/transam/xlog.c     | 79 +++++++++++++++++++--------
 src/backend/replication/walreceiver.c |  4 +-
 3 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 6119452ba8cd..7526378d57f0 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -286,7 +286,7 @@ extern XLogRecPtr XLogInsertRecord(struct XLogRecData *rdata,
 extern void XLogFlush(XLogRecPtr RecPtr);
 extern bool XLogBackgroundFlush(void);
 extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
-extern int	XLogFileInit(XLogSegNo segno, bool *added);
+extern int	XLogFileInit(XLogSegNo segno);
 extern int	XLogFileOpen(XLogSegNo segno);
 
 extern void CheckXLogRemoved(XLogSegNo segno, TimeLineID tli);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4ef417f1eddb..bbc9dd87b40a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2440,7 +2440,6 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 	bool		ispartialpage;
 	bool		last_iteration;
 	bool		finishing_seg;
-	bool		added;
 	int			curridx;
 	int			npages;
 	int			startidx;
@@ -2507,7 +2506,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
 							wal_segment_size);
 
 			/* create/use new log file */
-			openLogFile = XLogFileInit(openLogSegNo, &added);
+			openLogFile = XLogFileInit(openLogSegNo);
 			ReserveExternalFD();
 		}
 
@@ -3253,23 +3252,21 @@ XLogNeedsFlush(XLogRecPtr record)
 }
 
 /*
- * Create a new XLOG file segment, or open a pre-existing one.
+ * Try to make a given XLOG file segment exist.
  *
- * logsegno: identify segment to be created/opened.
+ * logsegno: identify segment.
  *
  * *added: on return, true if this call raised the number of extant segments.
  *
- * Returns FD of opened file.
+ * path: on return, this char[MAXPGPATH] has the path to the logsegno file.
  *
- * Note: errors here are ERROR not PANIC because we might or might not be
- * inside a critical section (eg, during checkpoint there is no reason to
- * take down the system on failure).  They will promote to PANIC if we are
- * in a critical section.
+ * Returns -1 or FD of opened file.  A -1 here is not an error; a caller
+ * wanting an open segment should attempt to open "path", which usually will
+ * succeed.  (This is weird, but it's efficient for the callers.)
  */
-int
-XLogFileInit(XLogSegNo logsegno, bool *added)
+static int
+XLogFileInitInternal(XLogSegNo logsegno, bool *added, char *path)
 {
-	char		path[MAXPGPATH];
 	char		tmppath[MAXPGPATH];
 	PGAlignedXLogBlock zbuffer;
 	XLogSegNo	installed_segno;
@@ -3407,26 +3404,53 @@ XLogFileInit(XLogSegNo logsegno, bool *added)
 	 */
 	max_segno = logsegno + CheckPointSegments;
 	if (InstallXLogFileSegment(&installed_segno, tmppath, true, max_segno))
+	{
 		*added = true;
+		elog(DEBUG2, "done creating and filling new WAL file");
+	}
 	else
 	{
 		/*
 		 * No need for any more future segments, or InstallXLogFileSegment()
-		 * failed to rename the file into place. If the rename failed, opening
-		 * the file below will fail.
+		 * failed to rename the file into place. If the rename failed, a
+		 * caller opening the file may fail.
 		 */
 		unlink(tmppath);
+		elog(DEBUG2, "abandoned new WAL file");
 	}
 
+	return -1;
+}
+
+/*
+ * Create a new XLOG file segment, or open a pre-existing one.
+ *
+ * logsegno: identify segment to be created/opened.
+ *
+ * Returns FD of opened file.
+ *
+ * Note: errors here are ERROR not PANIC because we might or might not be
+ * inside a critical section (eg, during checkpoint there is no reason to
+ * take down the system on failure).  They will promote to PANIC if we are
+ * in a critical section.
+ */
+int
+XLogFileInit(XLogSegNo logsegno)
+{
+	bool		ignore_added;
+	char		path[MAXPGPATH];
+	int			fd;
+
+	fd = XLogFileInitInternal(logsegno, &ignore_added, path);
+	if (fd >= 0)
+		return fd;
+
 	/* Now open original target segment (might not be file I just made) */
 	fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method));
 	if (fd < 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not open file \"%s\": %m", path)));
-
-	elog(DEBUG2, "done creating and filling new WAL file");
-
 	return fd;
 }
 
@@ -3883,6 +3907,15 @@ XLogFileClose(void)
  * High-volume systems will be OK once they've built up a sufficient set of
  * recycled log segments, but the startup transient is likely to include
  * a lot of segment creations by foreground processes, which is not so good.
+ *
+ * XLogFileInitInternal() can ereport(ERROR).  All known causes indicate big
+ * trouble; for example, a full filesystem is one cause.  The checkpoint WAL
+ * and/or ControlFile updates already completed.  If a RequestCheckpoint()
+ * initiated the present checkpoint and an ERROR ends this function, the
+ * command that called RequestCheckpoint() fails.  That's not ideal, but it's
+ * not worth contorting more functions to use caller-specified elevel values.
+ * (With or without RequestCheckpoint(), an ERROR forestalls some inessential
+ * reporting and resource reclamation.)
  */
 static void
 PreallocXlogFiles(XLogRecPtr endptr)
@@ -3890,6 +3923,7 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	XLogSegNo	_logSegNo;
 	int			lf;
 	bool		added;
+	char		path[MAXPGPATH];
 	uint64		offset;
 
 	XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size);
@@ -3897,8 +3931,9 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	if (offset >= (uint32) (0.75 * wal_segment_size))
 	{
 		_logSegNo++;
-		lf = XLogFileInit(_logSegNo, &added);
-		close(lf);
+		lf = XLogFileInitInternal(_logSegNo, &added, path);
+		if (lf >= 0)
+			close(lf);
 		if (added)
 			CheckpointStats.ckpt_segs_added++;
 	}
@@ -5212,7 +5247,6 @@ BootStrapXLOG(void)
 	XLogLongPageHeader longpage;
 	XLogRecord *record;
 	char	   *recptr;
-	bool		added;
 	uint64		sysidentifier;
 	struct timeval tv;
 	pg_crc32c	crc;
@@ -5309,7 +5343,7 @@ BootStrapXLOG(void)
 	record->xl_crc = crc;
 
 	/* Create first XLOG segment file */
-	openLogFile = XLogFileInit(1, &added);
+	openLogFile = XLogFileInit(1);
 
 	/*
 	 * We needn't bother with Reserve/ReleaseExternalFD here, since we'll
@@ -5615,10 +5649,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 		 * The switch happened at a segment boundary, so just create the next
 		 * segment on the new timeline.
 		 */
-		bool		added;
 		int			fd;
 
-		fd = XLogFileInit(startLogSegNo, &added);
+		fd = XLogFileInit(startLogSegNo);
 
 		if (close(fd) != 0)
 		{
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 976dbc04cde7..e1d7c0f75435 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -918,11 +918,9 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
 
 		if (recvFile < 0)
 		{
-			bool		added = true;
-
 			/* Create/use new log file */
 			XLByteToSeg(recptr, recvSegNo, wal_segment_size);
-			recvFile = XLogFileInit(recvSegNo, &added);
+			recvFile = XLogFileInit(recvSegNo);
 			recvFileTLI = ThisTimeLineID;
 		}
 
-- 
2.47.2

From 74f9824b0a2db22f0fabaa6059d6a8d33dd4e72b Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Mon, 28 Jun 2021 18:34:56 -0700
Subject: [PATCH v13 6/7] Skip WAL recycling and preallocation during archive
 recovery.

The previous commit addressed the chief consequences of a race condition
between InstallXLogFileSegment() and KeepFileRestoredFromArchive().  Fix
three lesser consequences.  A spurious durable_rename_excl() LOG message
remained possible.  KeepFileRestoredFromArchive() wasted the proceeds of
WAL recycling and preallocation.  Finally, XLogFileInitInternal() could
return a descriptor for a file that KeepFileRestoredFromArchive() had
already unlinked.  That felt like a recipe for future bugs.

Discussion: https://postgr.es/m/20210202151416.gb3304...@rfd.leadboat.com
---
 src/backend/access/transam/xlog.c | 65 +++++++++++++++++++++++++++----
 1 file changed, 57 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index bbc9dd87b40a..d585af79a6ec 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -675,6 +675,16 @@ typedef struct XLogCtlData
 	 */
 	bool		SharedHotStandbyActive;
 
+	/*
+	 * InstallXLogFileSegmentActive indicates whether the checkpointer should
+	 * arrange for future segments by recycling and/or PreallocXlogFiles().
+	 * Protected by ControlFileLock.  Only the startup process changes it.  If
+	 * true, anyone can use InstallXLogFileSegment().  If false, the startup
+	 * process owns the exclusive right to install segments, by reading from
+	 * the archive and possibly replacing existing files.
+	 */
+	bool		InstallXLogFileSegmentActive;
+
 	/*
 	 * SharedPromoteIsTriggered indicates if a standby promotion has been
 	 * triggered.  Protected by info_lck.
@@ -925,6 +935,7 @@ static int	XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
 						 int reqLen, XLogRecPtr targetRecPtr, char *readBuf);
 static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 										bool fetching_ckpt, XLogRecPtr tliRecPtr);
+static void XLogShutdownWalRcv(void);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
 static void XLogFileClose(void);
 static void PreallocXlogFiles(XLogRecPtr endptr);
@@ -3608,8 +3619,8 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno,
  * is false.)
  *
  * Returns true if the file was installed successfully.  false indicates that
- * max_segno limit was exceeded, or an error occurred while renaming the
- * file into place.
+ * max_segno limit was exceeded, the startup process has disabled this
+ * function for now, or an error occurred while renaming the file into place.
  */
 static bool
 InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
@@ -3621,6 +3632,11 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 	XLogFilePath(path, ThisTimeLineID, *segno, wal_segment_size);
 
 	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	if (!XLogCtl->InstallXLogFileSegmentActive)
+	{
+		LWLockRelease(ControlFileLock);
+		return false;
+	}
 
 	if (!find_free)
 	{
@@ -3725,6 +3741,7 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 	 */
 	if (source == XLOG_FROM_ARCHIVE)
 	{
+		Assert(!XLogCtl->InstallXLogFileSegmentActive);
 		KeepFileRestoredFromArchive(path, xlogfname);
 
 		/*
@@ -3926,6 +3943,9 @@ PreallocXlogFiles(XLogRecPtr endptr)
 	char		path[MAXPGPATH];
 	uint64		offset;
 
+	if (!XLogCtl->InstallXLogFileSegmentActive)
+		return;					/* unlocked check says no */
+
 	XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size);
 	offset = XLogSegmentOffset(endptr - 1, wal_segment_size);
 	if (offset >= (uint32) (0.75 * wal_segment_size))
@@ -4207,6 +4227,7 @@ RemoveXlogFile(const char *segname, XLogRecPtr lastredoptr, XLogRecPtr endptr)
 	 */
 	if (wal_recycle &&
 		endlogSegNo <= recycleSegNo &&
+		XLogCtl->InstallXLogFileSegmentActive &&	/* callee rechecks this */
 		lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
 		InstallXLogFileSegment(&endlogSegNo, path,
 							   true, recycleSegNo))
@@ -4220,7 +4241,7 @@ RemoveXlogFile(const char *segname, XLogRecPtr lastredoptr, XLogRecPtr endptr)
 	}
 	else
 	{
-		/* No need for any more future segments... */
+		/* No need for any more future segments, or recycling failed ... */
 		int			rc;
 
 		ereport(DEBUG2,
@@ -5225,6 +5246,7 @@ XLOGShmemInit(void)
 	XLogCtl->XLogCacheBlck = XLOGbuffers - 1;
 	XLogCtl->SharedRecoveryState = RECOVERY_STATE_CRASH;
 	XLogCtl->SharedHotStandbyActive = false;
+	XLogCtl->InstallXLogFileSegmentActive = false;
 	XLogCtl->SharedPromoteIsTriggered = false;
 	XLogCtl->WalWriterSleeping = false;
 
@@ -5251,6 +5273,11 @@ BootStrapXLOG(void)
 	struct timeval tv;
 	pg_crc32c	crc;
 
+	/* allow ordinary WAL segment creation, like StartupXLOG() would */
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	XLogCtl->InstallXLogFileSegmentActive = true;
+	LWLockRelease(ControlFileLock);
+
 	/*
 	 * Select a hopefully-unique system identifier code for this installation.
 	 * We use the result of gettimeofday(), including the fractional seconds
@@ -7531,7 +7558,7 @@ StartupXLOG(void)
 	 * over these records and subsequent ones if it's still alive when we
 	 * start writing WAL.
 	 */
-	ShutdownWalRcv();
+	XLogShutdownWalRcv();
 
 	/*
 	 * Reset unlogged relations to the contents of their INIT fork. This is
@@ -7556,7 +7583,7 @@ StartupXLOG(void)
 	 * recovery, e.g., timeline history file) from archive or pg_wal.
 	 *
 	 * Note that standby mode must be turned off after killing WAL receiver,
-	 * i.e., calling ShutdownWalRcv().
+	 * i.e., calling XLogShutdownWalRcv().
 	 */
 	Assert(!WalRcvStreaming());
 	StandbyMode = false;
@@ -7625,6 +7652,14 @@ StartupXLOG(void)
 	 */
 	oldestActiveXID = PrescanPreparedTransactions(NULL, NULL);
 
+	/*
+	 * Allow ordinary WAL segment creation before any exitArchiveRecovery(),
+	 * which sometimes creates a segment, and after the last ReadRecord().
+	 */
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	XLogCtl->InstallXLogFileSegmentActive = true;
+	LWLockRelease(ControlFileLock);
+
 	/*
 	 * Consider whether we need to assign a new timeline ID.
 	 *
@@ -12491,7 +12526,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 */
 					if (StandbyMode && CheckForStandbyTrigger())
 					{
-						ShutdownWalRcv();
+						XLogShutdownWalRcv();
 						return false;
 					}
 
@@ -12539,7 +12574,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 * WAL that we restore from archive.
 					 */
 					if (WalRcvStreaming())
-						ShutdownWalRcv();
+						XLogShutdownWalRcv();
 
 					/*
 					 * Before we sleep, re-scan for possible new timelines if
@@ -12669,7 +12704,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 */
 					if (pendingWalRcvRestart && !startWalReceiver)
 					{
-						ShutdownWalRcv();
+						XLogShutdownWalRcv();
 
 						/*
 						 * Re-scan for possible new timelines if we were
@@ -12720,6 +12755,9 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 									 tli, curFileTLI);
 						}
 						curFileTLI = tli;
+						LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+						XLogCtl->InstallXLogFileSegmentActive = true;
+						LWLockRelease(ControlFileLock);
 						RequestXLogStreaming(tli, ptr, PrimaryConnInfo,
 											 PrimarySlotName,
 											 wal_receiver_create_temp_slot);
@@ -12882,6 +12920,17 @@ StartupRequestWalReceiverRestart(void)
 	}
 }
 
+/* Thin wrapper around ShutdownWalRcv(). */
+static void
+XLogShutdownWalRcv(void)
+{
+	ShutdownWalRcv();
+
+	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+	XLogCtl->InstallXLogFileSegmentActive = false;
+	LWLockRelease(ControlFileLock);
+}
+
 /*
  * Determine what log level should be used to report a corrupt WAL record
  * in the current WAL page, previously read by XLogPageRead().
-- 
2.47.2

From e1eedb586974de10baf149c128987ad2dde447bc Mon Sep 17 00:00:00 2001
From: Noah Misch <n...@leadboat.com>
Date: Thu, 15 Sep 2022 06:45:23 -0700
Subject: [PATCH v13 7/7] Reset InstallXLogFileSegmentActive after walreceiver
 self-initiated exit.

After commit cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 added this flag,
failure to reset it caused assertion failures.  In non-assert builds, it
made the system fail to achieve the objectives listed in that commit;
chiefly, we might emit a spurious log message.  Back-patch to v15, where
that commit first appeared.

Bharath Rupireddy and Kyotaro Horiguchi.  Reviewed by Dilip Kumar,
Nathan Bossart and Michael Paquier.  Reported by Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-sE3ry=ycMPVtC+Djw4Fd7gbUGVv_qqw6qfzp=jlvq...@mail.gmail.com
---
 src/backend/access/transam/xlog.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d585af79a6ec..fc104a377ad9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12573,8 +12573,7 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 					 * walreceiver is not active, so that it won't overwrite
 					 * WAL that we restore from archive.
 					 */
-					if (WalRcvStreaming())
-						XLogShutdownWalRcv();
+					XLogShutdownWalRcv();
 
 					/*
 					 * Before we sleep, re-scan for possible new timelines if
-- 
2.47.2

Attachment: signature.asc
Description: PGP signature

Reply via email to