Thanks for the feedback.

> > > IIUC partial WAL files are handled because the next file in the
> > > sequence with the given TimeLineID won't be there, so we will fall
> > > back to a directory scan and pick it up.  Timeline history files are
> > > handled by forcing a directory scan, which should work because they
> > > always have the highest priority.  Backup history files, however, do
> > > not seem to be handled.  I think one approach to fixing that is to
> > > also treat backup history files similarly to timeline history files.
> > > If one is created, we force a directory scan, and the directory scan
> > > logic will consider backup history files as higher priority than
> > > everything but timeline history files.
> >
> > Backup history files are (currently) just informational and they are
> > finally processed at the end of a bulk-archiving performed by the fast
> > path.  However, I feel that it is cleaner to trigger a directory scan
> > every time we add an other-than-a-regular-WAL-file, as base-backup or
> - promotion are not supposed happen so infrequently.
> + promotion are not supposed happen so frequently.

I have incorporated the changes to trigger a directory scan in case of a
backup history file. Also, updated archiver to prioritize archiving a backup
history file over regular WAL files during directory scan to make sure that
backup history file gets archived before the directory scan gets disabled
as part of archiving a regular WAL file.

> > I've been looking at the v9 patch with fresh eyes, and I still think
> > we should be able to force the directory scan as needed in
> > XLogArchiveNotify().  Unless the file to archive is a regular WAL file
> > that is > our stored location in archiver memory, we should force a
> > directory scan.  I think it needs to be > instead of >= because we
> > don't know if the archiver has just completed a directory scan and
> > found a later segment to use to update the archiver state (but hasn't
> > yet updated the state in shared memory).
>
> I'm afraid that it can be seen as a violation of modularity. I feel
> that wal-emitter side should not be aware of that datail of
> archiving. Instead, I would prefer to keep directory scan as far as it
> found an smaller segment id than the next-expected segment id ever
> archived by the fast-path (if possible).  This would be
> less-performant in the case out-of-order segments are frequent but I
> think the overall objective of the original patch will be kept.

Archiver selects the file with lowest segment number as part of directory
scan and the next segment number gets resets based on this file. It starts
a new sequence from here and check the availability of the next file. If
there are holes then it will continue to fall back to directory scan. This
will
continue until it finds the next sequence in order. I think this is already
handled unless I am missing something.

> Also, I think we need to make sure to set PgArch->dirScan back to true
> > at the end of pgarch_readyXlog() unless we've found a new regular WAL
> > file that we can use to reset the archiver's stored location.  This
> > ensures that we'll keep doing directory scans as long as there are
> > timeline/backup history files to process.
>
> Right.

Done.

Please find the attached patch v10.

Thanks,
Dipesh
From c0a4bf3937934e576db49f7a30530dcf71a068d6 Mon Sep 17 00:00:00 2001
From: Dipesh Pandit <dipesh.pan...@enterprisedb.com>
Date: Tue, 24 Aug 2021 12:17:34 +0530
Subject: [PATCH] mitigate directory scan for WAL archiver

WAL archiver scans the status directory to identify the next WAL
file that needs to be archived. This directory scan can be minimised
by maintaining the log segment of current wAL file which is being
archived and incrementing it by '1' to get the next WAL file.
Archiver can check the availability of next file and in case if the
file is not available then it should fall-back to directory scan to
get the oldest WAL file.

There are some special scenarios like timeline switch, backup or
.ready file created out of order which requires archiver to perform a
full directory scan as archiving these files takes precedence over
regular WAL files.
---
 src/backend/access/transam/xlogarchive.c |  23 ++++-
 src/backend/postmaster/pgarch.c          | 170 ++++++++++++++++++++++++++++---
 src/include/postmaster/pgarch.h          |   1 +
 3 files changed, 178 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index b9c19b2..ececd10 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -465,6 +465,8 @@ KeepFileRestoredFromArchive(const char *path, const char *xlogfname)
  *
  * Optionally, nudge the archiver process so that it'll notice the file we
  * create.
+ *
+ * Notify archiver to enable directory scan to handle a few special scenarios.
  */
 void
 XLogArchiveNotify(const char *xlog, bool nudge)
@@ -492,6 +494,13 @@ XLogArchiveNotify(const char *xlog, bool nudge)
 		return;
 	}
 
+	/*
+	 * History files requires archiver to perform a full directory scan to
+	 * archive these files immediately.
+	 */
+	if (IsTLHistoryFileName(xlog) || IsBackupHistoryFileName(xlog))
+		PgArchEnableDirScan();
+
 	/* If caller requested, let archiver know it's got work to do */
 	if (nudge)
 		PgArchWakeup();
@@ -611,7 +620,19 @@ XLogArchiveCheckDone(const char *xlog)
 		return true;
 
 	/* Retry creation of the .ready file */
-	XLogArchiveNotify(xlog, true);
+	XLogArchiveNotify(xlog, false);
+
+	/*
+	 * This .ready file is created out of order, notify archiver to perform
+	 * a full directory scan to archive corresponding WAL file.
+	 */
+	StatusFilePath(archiveStatusPath, xlog, ".ready");
+	if (stat(archiveStatusPath, &stat_buf) == 0)
+	{
+		PgArchEnableDirScan();
+		PgArchWakeup();
+	}
+
 	return false;
 }
 
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 74a7d7c..802f17b 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -76,8 +76,25 @@
 typedef struct PgArchData
 {
 	int			pgprocno;		/* pgprocno of archiver process */
+
+	/*
+	 * Flag to enable/disable directory scan. If this flag is set then it
+	 * forces archiver to perform a full directory scan to get the next log
+	 * segment. It is not required to synchronize this flag as it guarantees
+	 * directory scan for the next cycle even if it is being missed in current
+	 * cycle.
+	 */
+	bool		dirScan;
 } PgArchData;
 
+/*
+ * Segment number and timeline ID to identify the next file in a WAL sequence
+ */
+typedef struct readyXLogState
+{
+	XLogSegNo	lastSegNo;
+	TimeLineID	lastTLI;
+} readyXLogState;
 
 /* ----------
  * Local data
@@ -97,9 +114,9 @@ static volatile sig_atomic_t ready_to_stop = false;
  */
 static void pgarch_waken_stop(SIGNAL_ARGS);
 static void pgarch_MainLoop(void);
-static void pgarch_ArchiverCopyLoop(void);
+static void pgarch_ArchiverCopyLoop(readyXLogState *xlogState);
 static bool pgarch_archiveXlog(char *xlog);
-static bool pgarch_readyXlog(char *xlog);
+static bool pgarch_readyXlog(char *xlog, readyXLogState *xlogState);
 static void pgarch_archiveDone(char *xlog);
 static void pgarch_die(int code, Datum arg);
 static void HandlePgArchInterrupts(void);
@@ -221,6 +238,15 @@ PgArchWakeup(void)
 		SetLatch(&ProcGlobal->allProcs[arch_pgprocno].procLatch);
 }
 
+/*
+ * Set dirScan flag in shared memory. Backend notifies archiver in case if an
+ * action requires full directory scan to get the next log segment.
+ */
+void
+PgArchEnableDirScan(void)
+{
+	PgArch->dirScan = true;
+}
 
 /* SIGUSR2 signal handler for archiver process */
 static void
@@ -243,10 +269,21 @@ pgarch_waken_stop(SIGNAL_ARGS)
 static void
 pgarch_MainLoop(void)
 {
+	readyXLogState xlogState;
 	pg_time_t	last_copy_time = 0;
 	bool		time_to_stop;
 
 	/*
+	 * Initialize xlogState, segment number and TLI will be reset/updated in
+	 * function pgarch_readyXlog() for each cycle.
+	 */
+	xlogState.lastSegNo = 0;
+	xlogState.lastTLI = 0;
+
+	/* First cycle after startup */
+	PgArchEnableDirScan();
+
+	/*
 	 * There shouldn't be anything for the archiver to do except to wait for a
 	 * signal ... however, the archiver exists to protect our data, so she
 	 * wakes up occasionally to allow herself to be proactive.
@@ -280,7 +317,7 @@ pgarch_MainLoop(void)
 		}
 
 		/* Do what we're here for */
-		pgarch_ArchiverCopyLoop();
+		pgarch_ArchiverCopyLoop(&xlogState);
 		last_copy_time = time(NULL);
 
 		/*
@@ -321,7 +358,7 @@ pgarch_MainLoop(void)
  * Archives all outstanding xlogs then returns
  */
 static void
-pgarch_ArchiverCopyLoop(void)
+pgarch_ArchiverCopyLoop(readyXLogState *xlogState)
 {
 	char		xlog[MAX_XFN_CHARS + 1];
 
@@ -331,7 +368,7 @@ pgarch_ArchiverCopyLoop(void)
 	 * some backend will add files onto the list of those that need archiving
 	 * while we are still copying earlier archives
 	 */
-	while (pgarch_readyXlog(xlog))
+	while (pgarch_readyXlog(xlog, xlogState))
 	{
 		int			failures = 0;
 		int			failures_orphan = 0;
@@ -596,29 +633,93 @@ pgarch_archiveXlog(char *xlog)
  * larger ID; the net result being that past timelines are given higher
  * priority for archiving.  This seems okay, or at least not obviously worth
  * changing.
+ *
+ * WAL files are generated in a specific order of log segment number. The
+ * directory scan for each WAL file can be minimised by identifying the next
+ * WAL file in the sequence. This can be achieved by maintaining log segment
+ * number and timeline ID corresponding to WAL file currently being archived.
+ * The log segment number of current WAL file can be incremented by '1' to
+ * point to the next WAL file in a sequence. Full directory scan can be avoided
+ * by checking the availability of next WAL file. "xlogState" specifies the
+ * segment number and timeline ID corresponding to the next WAL file.
+ *
+ * However, a full directory scan is performed in some special cases where it
+ * requires us to archive files which takes precedence over the next anticipated
+ * log segment. For example, history file takes precedence over archiving WAL
+ * files on older timeline or an older WAL file which is being left out because
+ * corresponding .ready file is created out of order or archiving a backup
+ * history file created during backup.
+ *
+ * Returns "true" if a segment is ready for archival, "xlog" represents the
+ * name of the segment.
  */
 static bool
-pgarch_readyXlog(char *xlog)
+pgarch_readyXlog(char *xlog, readyXLogState *xlogState)
 {
-	/*
-	 * open xlog status directory and read through list of xlogs that have the
-	 * .ready suffix, looking for earliest file. It is possible to optimise
-	 * this code, though only a single file is expected on the vast majority
-	 * of calls, so....
-	 */
+	char		basename[MAX_XFN_CHARS + 1];
+	char		xlogready[MAXPGPATH];
 	char		XLogArchiveStatusDir[MAXPGPATH];
 	DIR		   *rldir;
 	struct dirent *rlde;
+	struct stat	st;
 	bool		found = false;
 	bool		historyFound = false;
 
+	/*
+	 * Skip directory scan until it is not indicated by shared memory flag
+	 * dirScan.
+	 */
+	if (!PgArch->dirScan)
+	{
+		/*
+		 * We already have the next anticipated log segment and timeline, check
+		 * if this WAL is ready to be archived.
+		 */
+		XLogFileName(basename, xlogState->lastTLI, xlogState->lastSegNo, wal_segment_size);
+		StatusFilePath(xlogready, basename, ".ready");
+
+		if (stat(xlogready, &st) == 0)
+		{
+			strcpy(xlog, basename);
+
+			/*
+			 * Increment the readyXLogState's lastSegNo to point to the next
+			 * WAL file. Although we have not yet archived the current WAL file
+			 * and readyXLogState points to the next WAL file, this is safe
+			 * because the next cycle will not begin until we finish archiving
+			 * current WAL file.
+			 */
+			xlogState->lastSegNo++;
+			return true;
+		}
+	}
+
+	/*
+	 * Perform a full directory scan to identify the next log segment. There
+	 * may be one of the following scenarios which may require us to perform a
+	 * full directory scan.
+	 *
+	 * - This is the first cycle since archiver has started and there is no
+	 *   idea about the next anticipated log segment.
+	 *
+	 * - There is a timeline switch, archive history file as part of this
+	 *   timeline switch.
+	 *
+	 * - .ready file is created out of order.
+	 *
+	 * - A backup history file created during backup.
+	 *
+	 * - The next anticipated log segment is not available.
+	 *
+	 * open xlog status directory and read through list of xlogs that have the
+	 * .ready suffix, looking for earliest file.
+	 */
 	snprintf(XLogArchiveStatusDir, MAXPGPATH, XLOGDIR "/archive_status");
 	rldir = AllocateDir(XLogArchiveStatusDir);
 
 	while ((rlde = ReadDir(rldir, XLogArchiveStatusDir)) != NULL)
 	{
 		int			basenamelen = (int) strlen(rlde->d_name) - 6;
-		char		basename[MAX_XFN_CHARS + 1];
 		bool		ishistory;
 
 		/* Ignore entries with unexpected number of characters */
@@ -638,8 +739,14 @@ pgarch_readyXlog(char *xlog)
 		memcpy(basename, rlde->d_name, basenamelen);
 		basename[basenamelen] = '\0';
 
-		/* Is this a history file? */
-		ishistory = IsTLHistoryFileName(basename);
+		/*
+		 * Archiving timeline history file takes precedence over WAL file but
+		 * if this directory scan has been enabled to archive a backup history
+		 * file then archive the backup history file on priority before the
+		 * directory scan gets disabled as part archiving regular WAL file.
+		 */
+		ishistory = IsTLHistoryFileName(basename) ||
+			IsBackupHistoryFileName(basename);
 
 		/*
 		 * Consume the file to archive.  History files have the highest
@@ -661,6 +768,39 @@ pgarch_readyXlog(char *xlog)
 				strcpy(xlog, basename);
 		}
 	}
+
+	if (found)
+	{
+		if (IsXLogFileName(xlog))
+		{
+			/*
+			 * Reset the flag only when we found a regular WAL file to make
+			 * sure that we are done with processing history files.
+			 */
+			PgArch->dirScan = false;
+
+			/*
+			 * Make sure that the flag reset is flushed to memory before it is
+			 * examined or set for the next cycle.
+			 */
+			pg_memory_barrier();
+
+			/*
+			 * Reset segment number and timeline ID as this is the beginning of a
+			 * new sequence.
+			 */
+			XLogFromFileName(xlog, &xlogState->lastTLI, &xlogState->lastSegNo,
+					wal_segment_size);
+
+			/* Increment log segment number to point to the next WAL file */
+			xlogState->lastSegNo++;
+		}
+
+		ereport(LOG,
+				(errmsg("directory scan to archive write-ahead log file \"%s\"",
+						xlog)));
+	}
+
 	FreeDir(rldir);
 
 	return found;
diff --git a/src/include/postmaster/pgarch.h b/src/include/postmaster/pgarch.h
index 1e47a14..265f7e5 100644
--- a/src/include/postmaster/pgarch.h
+++ b/src/include/postmaster/pgarch.h
@@ -31,5 +31,6 @@ extern void PgArchShmemInit(void);
 extern bool PgArchCanRestart(void);
 extern void PgArchiverMain(void) pg_attribute_noreturn();
 extern void PgArchWakeup(void);
+extern void PgArchEnableDirScan(void);
 
 #endif							/* _PGARCH_H */
-- 
1.8.3.1

Reply via email to