Hi All,

Attached is a patch series that adds a new feature: it lets pg_waldump
decode WAL files directly from a tar archive. This work addresses a
limitation in pg_verifybackup[1], which cannot parse WAL files from
tar-formatted backups.

The implementation will align with pg_waldump's existing xlogreader
design, which uses three callback functions to manage WAL segments:
open, read, and close. For tar archives, however, the approach will be
simpler. Instead of using separate callbacks for opening and closing,
the tar archive will be opened once at the start and closed explicitly
at the end.

The core logic will be in the WAL page reading callback. When
xlogreader requests a new WAL page, this callback will be invoked. It
will then call the archive streamer routine to read the WAL data from
the tar archive into a buffer. This data will then be copied into
xlogreader's own buffer, completing the read.
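
To illustrate the copy step, here is a minimal standalone sketch (not
the code from the patch). The WalWindow struct and window_copy() are
hypothetical names invented for this example; they stand in for the
archive streamer's buffer and for the part of the read callback that
copies buffered bytes into xlogreader's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the archive streamer's buffer: it holds the
 * WAL bytes covering positions [end_pos - len, end_pos).
 */
typedef struct WalWindow
{
	const char *data;
	size_t		len;
	uint64_t	end_pos;		/* position just past the last buffered byte */
} WalWindow;

/*
 * Copy up to 'count' bytes starting at WAL position 'pos' from the window
 * into 'dst'.  Returns the number of bytes copied; 0 means the requested
 * position is not buffered and the caller must fetch more data.
 */
static size_t
window_copy(const WalWindow *w, uint64_t pos, char *dst, size_t count)
{
	uint64_t	start_pos;
	size_t		skip;
	size_t		avail;
	size_t		n;

	assert(w->end_pos >= w->len);
	start_pos = w->end_pos - w->len;

	/* Requested position lies outside the buffered range. */
	if (pos < start_pos || pos >= w->end_pos)
		return 0;

	/* Skip bytes that precede the requested position. */
	skip = (size_t) (pos - start_pos);
	avail = w->len - skip;
	n = (avail < count) ? avail : count;

	memcpy(dst, w->data + skip, n);
	return n;
}
```

A caller would loop: copy what is available, and when window_copy()
returns 0, read the next chunk of the archive through the streamer and
retry, stopping when the archive is exhausted.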

Essentially, this is plumbing work: the new code will be responsible
for getting WAL data from the tar archive and feeding it to the
existing xlogreader. All other WAL page and record decoding logic,
which is already robust within xlogreader, will be reused as is.

This feature is implemented as a series of patches:

- Refactoring: The first few patches (0001-0004) are dedicated to
refactoring and minor code changes.

- 0005: This patch introduces the core functionality for pg_waldump to
read WAL from a tar archive using the same archive streamer
(fe_utils/astreamer.h) used in pg_verifybackup. This version requires
WAL files in the archive to be in sequential order.

- 0006: This patch removes the sequential order restriction. If
pg_waldump encounters an out-of-order WAL file, it writes the file to
a temporary directory. The utility will then continue decoding and
read from this temporary location later.

- 0007 and onwards: These patches update pg_verifybackup to remove the
restriction on WAL parsing for tar-formatted backups. Patch 0008 renames the
"--wal-directory" switch to "--wal-path" to make it more generic, allowing
it to accept either a directory path or a tar archive path.
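
The per-member decision described for 0005 and 0006 can be sketched
roughly as below. This is only an illustration, not the code from the
patches: SegAction and classify_segment() are hypothetical names, and
skipping a segment that is older than the next expected one is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcomes for one WAL segment found in the archive. */
typedef enum SegAction
{
	SEG_SKIP,					/* outside the requested range, or stale */
	SEG_STREAM,					/* the segment the decoder expects next */
	SEG_SPILL					/* needed later: write to a temporary file */
} SegAction;

/*
 * Decide what to do with segment 'seg', given the requested range
 * [start_seg, end_seg] and the segment the decoder expects next.
 */
static SegAction
classify_segment(uint64_t seg, uint64_t start_seg, uint64_t end_seg,
				 uint64_t next_seg)
{
	assert(start_seg <= end_seg);

	if (seg < start_seg || seg > end_seg)
		return SEG_SKIP;

	/* Assumption: a segment older than the expected one is not needed. */
	if (seg < next_seg)
		return SEG_SKIP;

	if (seg == next_seg)
		return SEG_STREAM;

	/* In range but ahead of the expected segment: out of order. */
	return SEG_SPILL;
}
```

With only 0005 applied, the SEG_SPILL case corresponds to the "not
archived in sequential order" error; 0006 replaces that error with the
temporary-directory path.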

-----------------------------------
Known Issues & Status:
-----------------------------------
- Timeline Switching: The current implementation in patch 0006 does not
correctly handle timeline switching. This is a known issue, especially
when a timeline change occurs on a WAL file that has been written to a
temporary location.

- Testing: Local regression tests on CentOS and macOS M4 are passing.
However, some tests on macOS Sonoma (specifically 008_untar.pl and
010_client_untar.pl) are failing in the GitHub workflow with a "WAL
parsing failed for timeline 1" error. This issue is currently being
investigated.

Please take a look at the attached patches and let me know your
thoughts. This is an initial version, and I am making incremental
improvements to address the known issues and limitations.


[1] https://git.postgresql.org/pg/commitdiff/8dfd3129027969fdd2d9d294220c867d2efd84aa

--
Regards,
Amul Sul
EDB: http://www.enterprisedb.com
From 420ab4e05566f81fb15488ae7060b9d5648994b5 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v1 1/9] Refactor: pg_waldump: Move some declarations to new
 pg_waldump.h

This is in preparation for adding a second source file to this
directory.
---
 src/bin/pg_waldump/pg_waldump.c | 11 ++---------
 src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 9 deletions(-)
 create mode 100644 src/bin/pg_waldump/pg_waldump.h

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
 #include "common/logging.h"
 #include "common/relpath.h"
 #include "getopt_long.h"
+#include "pg_waldump.h"
 #include "rmgrdesc.h"
 #include "storage/bufpage.h"
 
@@ -39,19 +40,11 @@
 
 static const char *progname;
 
-static int	WalSegSz;
+int			WalSegSz = DEFAULT_XLOG_SEG_SIZE;
 static volatile sig_atomic_t time_to_stop = false;
 
 static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
 
-typedef struct XLogDumpPrivate
-{
-	TimeLineID	timeline;
-	XLogRecPtr	startptr;
-	XLogRecPtr	endptr;
-	bool		endptr_reached;
-} XLogDumpPrivate;
-
 typedef struct XLogDumpConfig
 {
 	/* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..cd9a36d7447
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		  src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+	TimeLineID	timeline;
+	XLogRecPtr	startptr;
+	XLogRecPtr	endptr;
+	bool		endptr_reached;
+} XLogDumpPrivate;
+
+#endif		/* end of PG_WALDUMP_H */
-- 
2.47.1

From 30a226b1ae5ce3d1460bb3359c96a4e9a93d6b31 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v1 2/9] Refactor: pg_waldump: Separate logic used to calculate
 the required read size.

This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
 src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
 	return NULL;				/* not reached */
 }
 
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+				  int reqLen)
+{
+	int			count = XLOG_BLCKSZ;
+
+	if (private->endptr != InvalidXLogRecPtr)
+	{
+		if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+			count = XLOG_BLCKSZ;
+		else if (targetPagePtr + reqLen <= private->endptr)
+			count = private->endptr - targetPagePtr;
+		else
+		{
+			private->endptr_reached = true;
+			return -1;
+		}
+	}
+
+	return count;
+}
+
 /* pg_waldump's XLogReaderRoutine->segment_open callback */
 static void
 WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 				XLogRecPtr targetPtr, char *readBuff)
 {
 	XLogDumpPrivate *private = state->private_data;
-	int			count = XLOG_BLCKSZ;
+	int			count = required_read_len(private, targetPagePtr, reqLen);
 	WALReadError errinfo;
 
-	if (private->endptr != InvalidXLogRecPtr)
-	{
-		if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
-			count = XLOG_BLCKSZ;
-		else if (targetPagePtr + reqLen <= private->endptr)
-			count = private->endptr - targetPagePtr;
-		else
-		{
-			private->endptr_reached = true;
-			return -1;
-		}
-	}
+	if (private->endptr_reached)
+		return -1;
 
 	if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
 				 &errinfo))
-- 
2.47.1

From 9e9121433ff4394238698bd27b3411daace5fd86 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v1 3/9] Refactor: pg_waldump: Restructure TAP tests.

Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
 src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
 1 file changed, 67 insertions(+), 56 deletions(-)

diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
 	],
 	qr/./,
 	'runs with start and end segment specified');
-command_fails_like(
-	[ 'pg_waldump', '--path' => $node->data_dir ],
-	qr/error: no start WAL location given/,
-	'path option requires start location');
-command_like(
-	[
-		'pg_waldump',
-		'--path' => $node->data_dir,
-		'--start' => $start_lsn,
-		'--end' => $end_lsn,
-	],
-	qr/./,
-	'runs with path option and start and end locations');
-command_fails_like(
-	[
-		'pg_waldump',
-		'--path' => $node->data_dir,
-		'--start' => $start_lsn,
-	],
-	qr/error: error in WAL record at/,
-	'falling off the end of the WAL results in an error');
-
 command_like(
 	[
 		'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
 	],
 	qr/^$/,
 	'no output with --quiet option');
-command_fails_like(
-	[
-		'pg_waldump', '--quiet',
-		'--path' => $node->data_dir,
-		'--start' => $start_lsn
-	],
-	qr/error: error in WAL record at/,
-	'errors are shown with --quiet');
-
 
 # Test for: Display a message that we're skipping data if `from`
 # wasn't a pointer to the start of a record.
@@ -272,7 +241,6 @@ sub test_pg_waldump
 
 	my $result = IPC::Run::run [
 		'pg_waldump',
-		'--path' => $node->data_dir,
 		'--start' => $start_lsn,
 		'--end' => $end_lsn,
 		@opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
 
 my @lines;
 
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+	{
+		'path' => $node->data_dir
+	});
 
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+	my $path = $scenario->{'path'};
 
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+	SKIP:
+	{
+		command_fails_like(
+			[ 'pg_waldump', '--path' => $path ],
+			qr/error: no start WAL location given/,
+			'path option requires start location');
+		command_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+				'--end' => $end_lsn,
+			],
+			qr/./,
+			'runs with path option and start and end locations');
+		command_fails_like(
+			[
+				'pg_waldump',
+				'--path' => $path,
+				'--start' => $start_lsn,
+			],
+			qr/error: error in WAL record at/,
+			'falling off the end of the WAL results in an error');
 
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+		command_fails_like(
+			[
+				'pg_waldump', '--quiet',
+				'--path' => $path,
+				'--start' => $start_lsn
+			],
+			qr/error: error in WAL record at/,
+			'errors are shown with --quiet');
 
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+		@lines = test_pg_waldump('--path' => $path);
+		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
 
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+		@lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+		is(@lines, 6, 'limit option observed');
 
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+		@lines = test_pg_waldump('--path' => $path, '--fullpage');
+		is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
 
-@lines = test_pg_waldump(
-	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
-	0, 'only lines for selected relation');
+		@lines = test_pg_waldump('--path' => $path, '--stats');
+		like($lines[0], qr/WAL statistics/, "statistics on stdout");
+		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-@lines = test_pg_waldump(
-	'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
-	'--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+		@lines = test_pg_waldump('--path' => $path, '--stats=record');
+		like($lines[0], qr/WAL statistics/, "statistics on stdout");
+		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
+		@lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+		is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+		@lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+		is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+		@lines = test_pg_waldump('--path' => $path,
+			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+		is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+			0, 'only lines for selected relation');
+
+		@lines = test_pg_waldump('--path' => $path,
+			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+			'--block' => 1);
+		is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+	}
+}
 
 done_testing();
-- 
2.47.1

From 58062914fb56aa4f8e005dbd24e072251f3150b6 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Tue, 29 Jul 2025 14:59:01 +0530
Subject: [PATCH v1 4/9] pg_waldump: Rename directory creation routine for
 generalized use.

Rename create_fullpage_directory(), currently used only for storing
full-page images from WAL records, to the more general
create_directory(). This allows it to be reused in future patches
for creating other directories as needed.
---
 src/bin/pg_waldump/pg_waldump.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..4775275c07a 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -114,11 +114,11 @@ verify_directory(const char *directory)
 }
 
 /*
- * Create if necessary the directory storing the full-page images extracted
- * from the WAL records read.
+ * Create the directory if it doesn't exist. Report an error if creation fails
+ * or if an existing directory is not empty.
  */
 static void
-create_fullpage_directory(char *path)
+create_directory(char *path)
 {
 	int			ret;
 
@@ -1112,8 +1112,12 @@ main(int argc, char **argv)
 		}
 	}
 
+	/*
+	 * Create if necessary the directory storing the full-page images
+	 * extracted from the WAL records read.
+	 */
 	if (config.save_fullpage_path != NULL)
-		create_fullpage_directory(config.save_fullpage_path);
+		create_directory(config.save_fullpage_path);
 
 	/* parse files as start/end boundaries, extract path if not specified */
 	if (optind < argc)
-- 
2.47.1

From 21d9d604ca4b4ab08c5bc32decf1afc8d881c43c Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 16 Jul 2025 18:37:59 +0530
Subject: [PATCH v1 5/9] pg_waldump: Add support for archived WAL decoding.

pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.

Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
 doc/src/sgml/ref/pg_waldump.sgml       |   8 +-
 src/bin/pg_waldump/Makefile            |   7 +-
 src/bin/pg_waldump/astreamer_waldump.c | 378 +++++++++++++++++++++++++
 src/bin/pg_waldump/meson.build         |   4 +-
 src/bin/pg_waldump/pg_waldump.c        | 361 +++++++++++++++++++----
 src/bin/pg_waldump/pg_waldump.h        |  21 +-
 src/bin/pg_waldump/t/001_basic.pl      |  64 ++++-
 src/tools/pgindent/typedefs.list       |   1 +
 8 files changed, 765 insertions(+), 79 deletions(-)
 create mode 100644 src/bin/pg_waldump/astreamer_waldump.c

diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
       <term><option>--path=<replaceable>path</replaceable></option></term>
       <listitem>
        <para>
-        Specifies a directory to search for WAL segment files or a
-        directory with a <literal>pg_wal</literal> subdirectory that
+        Specifies a tar archive or a directory to search for WAL segment files
+        or a directory with a <literal>pg_wal</literal> subdirectory that
         contains such files.  The default is to search in the current
         directory, the <literal>pg_wal</literal> subdirectory of the
         current directory, and the <literal>pg_wal</literal> subdirectory
         of <envar>PGDATA</envar>.
        </para>
+       <para>
+        If a tar archive is provided, its WAL segment files must be in
+        sequential order; otherwise, an error will be reported.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..b234613eb50 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
 PGFILEDESC = "pg_waldump - decode and display WAL"
 PGAPPICON=win32
 
+# make these available to TAP test scripts
+export TAR
+
 subdir = src/bin/pg_waldump
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
 	$(WIN32RES) \
 	compat.o \
 	pg_waldump.o \
+	astreamer_waldump.o \
 	rmgrdesc.o \
 	xlogreader.o \
 	xlogstats.o
 
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
 
 RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
 RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
new file mode 100644
index 00000000000..d0ac903c54e
--- /dev/null
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -0,0 +1,378 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_waldump.c
+ *		A generic facility for reading WAL data from tar archives via archive
+ *		streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		src/bin/pg_waldump/astreamer_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "access/xlogdefs.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE				(128 * 1024)
+
+typedef struct astreamer_waldump
+{
+	/* These fields don't change once initialized. */
+	astreamer	base;
+	XLogSegNo	startSegNo;
+	XLogSegNo	endSegNo;
+	XLogDumpPrivate *privateInfo;
+
+	/* These fields change with archive member. */
+	bool		skipThisSeg;
+	XLogSegNo	nextSegNo;		/* Next expected segment to stream */
+} astreamer_waldump;
+
+static int	astreamer_archive_read(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+									  astreamer_member *member,
+									  const char *data, int len,
+									  astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_relevant_wal(astreamer_member *member,
+								   TimeLineID startTimeLineID,
+								   XLogSegNo startSegNo,
+								   XLogSegNo endSegNo,
+								   XLogSegNo nextSegNo,
+								   XLogSegNo *curSegNo,
+								   TimeLineID *curSegTimeline);
+
+static const astreamer_ops astreamer_waldump_ops = {
+	.content = astreamer_waldump_content,
+	.finalize = astreamer_waldump_finalize,
+	.free = astreamer_waldump_free
+};
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
+				   XLogDumpPrivate *privateInfo)
+{
+	char	   *p = readBuff;
+	Size		nbytes = count;
+	XLogRecPtr	recptr = targetPagePtr;
+	volatile StringInfo astreamer_buf = privateInfo->archive_streamer_buf;
+
+	while (nbytes > 0)
+	{
+		char	   *buf = astreamer_buf->data;
+		int			len = astreamer_buf->len;
+
+		/* WAL record range that the buffer contains */
+		XLogRecPtr	endPtr = privateInfo->archive_streamer_read_ptr;
+		XLogRecPtr	startPtr = (endPtr > len) ? endPtr - len : 0;
+
+		/*
+		 * Ignore existing data if the required target page has not yet been
+		 * read.
+		 */
+		if (recptr >= endPtr)
+		{
+			len = 0;
+
+			/* Reset the buffer */
+			resetStringInfo(astreamer_buf);
+		}
+
+		if (len > 0 && recptr > startPtr)
+		{
+			int			skipBytes = 0;
+
+			/*
+			 * The required offset is not at the start of the archive streamer
+			 * buffer, so skip bytes until reaching the desired offset of the
+			 * target page.
+			 */
+			skipBytes = recptr - startPtr;
+
+			buf += skipBytes;
+			len -= skipBytes;
+		}
+
+		if (len > 0)
+		{
+			int			readBytes = len >= nbytes ? nbytes : len;
+
+			/*
+			 * Ensure we are reading the correct page, unless we've received an
+			 * invalid record pointer. In that specific case, it's acceptable
+			 * to read any page.
+			 */
+			Assert(XLogRecPtrIsInvalid(recptr) ||
+				   (recptr >= startPtr && recptr < endPtr));
+
+			memcpy(p, buf, readBytes);
+
+			/* Update state for read */
+			nbytes -= readBytes;
+			p += readBytes;
+			recptr += readBytes;
+		}
+		else
+		{
+			/* Fetch more data */
+			if (astreamer_archive_read(privateInfo) == 0)
+				break;			/* No data remaining */
+		}
+	}
+
+	return (count - nbytes) ? (count - nbytes) : -1;
+}
+
+/*
+ * Reads the archive and passes it to the archive streamer for decompression.
+ */
+static int
+astreamer_archive_read(XLogDumpPrivate *privateInfo)
+{
+	int			rc;
+	char	   *buffer;
+
+	buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+	/* Read more data from the tar file */
+	rc = read(privateInfo->archive_fd, buffer, READ_CHUNK_SIZE);
+	if (rc < 0)
+		pg_fatal("could not read file \"%s\": %m",
+				 privateInfo->archive_name);
+
+	/*
+	 * Decompress (if required), and then parse the previously read contents of
+	 * the tar file.
+	 */
+	if (rc > 0)
+		astreamer_content(privateInfo->archive_streamer, NULL,
+						  buffer, rc, ASTREAMER_UNKNOWN);
+	pg_free(buffer);
+
+	return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+astreamer *
+astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
+							  XLogRecPtr endPtr, XLogDumpPrivate *privateInfo)
+{
+	astreamer_waldump *streamer;
+
+	streamer = palloc0(sizeof(astreamer_waldump));
+	*((const astreamer_ops **) &streamer->base.bbs_ops) =
+		&astreamer_waldump_ops;
+
+	streamer->base.bbs_next = next;
+	initStringInfo(&streamer->base.bbs_buffer);
+
+	if (XLogRecPtrIsInvalid(startptr))
+		streamer->startSegNo = 0;
+	else
+	{
+		XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
+
+		/*
+		 * Initialize the record pointer to the beginning of the first
+		 * segment; this pointer will track the WAL record reading status.
+		 */
+		XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
+								privateInfo->archive_streamer_read_ptr);
+	}
+
+	if (XLogRecPtrIsInvalid(endPtr))
+		streamer->endSegNo = UINT64_MAX;
+	else
+		XLByteToSeg(endPtr, streamer->endSegNo, WalSegSz);
+
+	streamer->nextSegNo = streamer->startSegNo;
+	streamer->privateInfo = privateInfo;
+
+	return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL from a tar file.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+						  const char *data, int len,
+						  astreamer_archive_context context)
+{
+	astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+	XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+	Assert(context != ASTREAMER_UNKNOWN);
+
+	switch (context)
+	{
+		case ASTREAMER_MEMBER_HEADER:
+			{
+				XLogSegNo	segNo;
+				TimeLineID	timeline;
+
+				pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+				mystreamer->skipThisSeg = false;
+
+				if (!member_is_relevant_wal(member,
+											privateInfo->timeline,
+											mystreamer->startSegNo,
+											mystreamer->endSegNo,
+											mystreamer->nextSegNo,
+											&segNo, &timeline))
+				{
+					mystreamer->skipThisSeg = true;
+					break;
+				}
+
+				/*
+				 * If nextSegNo is 0, the check is skipped, and any WAL file
+				 * can be read -- this typically occurs during initial
+				 * verification.
+				 */
+				if (mystreamer->nextSegNo == 0)
+					break;
+
+				/* WAL segments must be archived in order */
+				if (mystreamer->nextSegNo != segNo)
+				{
+					pg_log_error("WAL files are not archived in sequential order");
+					pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+										mystreamer->nextSegNo, segNo);
+					exit(1);
+				}
+
+				/*
+				 * We track the reading of WAL segment records using a pointer
+				 * that's continuously incremented by the length of the
+				 * received data. This pointer is crucial for serving WAL page
+				 * requests from the WAL decoding routine, so it must be
+				 * accurate.
+				 */
+#ifdef USE_ASSERT_CHECKING
+				if (mystreamer->nextSegNo != 0)
+				{
+					XLogRecPtr	recPtr;
+
+					XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
+					Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+				}
+#endif
+
+				/* Save the timeline */
+				privateInfo->timeline = timeline;
+
+				/* Update the next expected segment number */
+				mystreamer->nextSegNo += 1;
+			}
+			break;
+
+		case ASTREAMER_MEMBER_CONTENTS:
+			/* Skip this segment */
+			if (mystreamer->skipThisSeg)
+				break;
+
+			/* Or, copy contents to buffer */
+			privateInfo->archive_streamer_read_ptr += len;
+			astreamer_buffer_bytes(streamer, &data, &len, len);
+			break;
+
+		case ASTREAMER_MEMBER_TRAILER:
+			break;
+
+		case ASTREAMER_ARCHIVE_TRAILER:
+			break;
+
+		default:
+			/* Shouldn't happen. */
+			pg_fatal("unexpected state while parsing tar file");
+	}
+}
+
+/*
+ * End-of-stream processing for an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+	Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with an astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+	Assert(streamer->bbs_next == NULL);
+
+	pfree(streamer->bbs_buffer.data);
+	pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format and
+ * the corresponding WAL segment falls within the WAL decoding target range;
+ * otherwise, returns false.
+ */
+static bool
+member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
+					   XLogSegNo startSegNo, XLogSegNo endSegNo,
+					   XLogSegNo nextSegNo, XLogSegNo *curSegNo,
+					   TimeLineID *curSegTimeline)
+{
+	int			pathlen;
+	XLogSegNo	segNo;
+	TimeLineID	timeline;
+	char	   *fname;
+
+	/* We are only interested in normal files. */
+	if (member->is_directory || member->is_link)
+		return false;
+
+	pathlen = strlen(member->pathname);
+	if (pathlen < XLOG_FNAME_LEN)
+		return false;
+
+	/* WAL file could be with full path */
+	fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+	if (!IsXLogFileName(fname))
+		return false;
+
+	/* Parse position from file */
+	XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+	/* Ignore the older timeline */
+	if (startTimeLineID > timeline)
+		return false;
+
+	/* Skip if the current segment is not the desired one */
+	if (startSegNo > segNo || endSegNo < segNo)
+		return false;
+
+	*curSegNo = segNo;
+	*curSegTimeline = timeline;
+
+	return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..2a0300dc339 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
 pg_waldump_sources = files(
   'compat.c',
   'pg_waldump.c',
+  'astreamer_waldump.c',
   'rmgrdesc.c',
 )
 
@@ -18,7 +19,7 @@ endif
 
 pg_waldump = executable('pg_waldump',
   pg_waldump_sources,
-  dependencies: [frontend_code, lz4, zstd],
+  dependencies: [frontend_code, lz4, zstd, libpq],
   c_args: ['-DFRONTEND'], # needed for xlogreader et al
   kwargs: default_bin_args,
 )
@@ -29,6 +30,7 @@ tests += {
   'sd': meson.current_source_dir(),
   'bd': meson.current_build_dir(),
   'tap': {
+    'env': {'TAR': tar.found() ? tar.full_path() : ''},
     'tests': [
       't/001_basic.pl',
       't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 4775275c07a..64f3a65b735 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -182,10 +182,9 @@ open_file_in_directory(const char *directory, const char *fname)
 {
 	int			fd = -1;
 	char		fpath[MAXPGPATH];
+	char	   *dir = directory ? (char *) directory : ".";
 
-	Assert(directory != NULL);
-
-	snprintf(fpath, MAXPGPATH, "%s/%s", directory, fname);
+	snprintf(fpath, MAXPGPATH, "%s/%s", dir, fname);
 	fd = open(fpath, O_RDONLY | PG_BINARY, 0);
 
 	if (fd < 0 && errno != ENOENT)
@@ -326,6 +325,160 @@ identify_target_directory(char *directory, char *fname)
 	return NULL;				/* not reached */
 }
 
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+static bool
+is_tar_file(const char *fname, pg_compress_algorithm *compression)
+{
+	int			fname_len = strlen(fname);
+	pg_compress_algorithm compress_algo;
+
+	/* Now, check the compression type of the tar */
+	if (fname_len > 4 &&
+		strcmp(fname + fname_len - 4, ".tar") == 0)
+		compress_algo = PG_COMPRESSION_NONE;
+	else if (fname_len > 4 &&
+			 strcmp(fname + fname_len - 4, ".tgz") == 0)
+		compress_algo = PG_COMPRESSION_GZIP;
+	else if (fname_len > 7 &&
+			 strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+		compress_algo = PG_COMPRESSION_GZIP;
+	else if (fname_len > 8 &&
+			 strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+		compress_algo = PG_COMPRESSION_LZ4;
+	else if (fname_len > 8 &&
+			 strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+		compress_algo = PG_COMPRESSION_ZSTD;
+	else
+		return false;
+
+	*compression = compress_algo;
+
+	return true;
+}
+
+/*
+ * Creates an appropriate chain of archive streamers for reading the given
+ * tar archive.
+ */
+static void
+setup_astreamer(XLogDumpPrivate *private, pg_compress_algorithm compression,
+				XLogRecPtr startptr, XLogRecPtr endptr)
+{
+	astreamer  *streamer = NULL;
+
+	streamer = astreamer_waldump_content_new(NULL, startptr, endptr, private);
+
+	/*
+	 * Final extracted WAL data will reside in this streamer. However, since
+	 * it sits at the bottom of the stack and isn't designed to propagate data
+	 * upward, we need to hold a pointer to its data buffer in order to copy.
+	 */
+	private->archive_streamer_buf = &streamer->bbs_buffer;
+
+	/* Before that we must parse the tar archive. */
+	streamer = astreamer_tar_parser_new(streamer);
+
+	/* Before that we must decompress, if archive is compressed. */
+	if (compression == PG_COMPRESSION_GZIP)
+		streamer = astreamer_gzip_decompressor_new(streamer);
+	else if (compression == PG_COMPRESSION_LZ4)
+		streamer = astreamer_lz4_decompressor_new(streamer);
+	else if (compression == PG_COMPRESSION_ZSTD)
+		streamer = astreamer_zstd_decompressor_new(streamer);
+
+	private->archive_streamer = streamer;
+}
+
+/*
+ * Initializes the archive reader for a tar file.
+ */
+static void
+init_tar_archive_reader(XLogDumpPrivate *private, char *waldir,
+						pg_compress_algorithm compression)
+{
+	int			fd;
+
+	/* Open the tar archive and store its file descriptor */
+	fd = open_file_in_directory(waldir, private->archive_name);
+
+	if (fd < 0)
+		pg_fatal("could not open file \"%s\"", private->archive_name);
+
+	private->archive_fd = fd;
+
+	/* Set up the tar archive reading facility */
+	setup_astreamer(private, compression, private->startptr, private->endptr);
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+static void
+free_tar_archive_reader(XLogDumpPrivate *private)
+{
+	/*
+	 * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+	 * flush any remaining buffered data or to ensure the end of the tar
+	 * archive is reached. However, when decoding a WAL file, once we hit the
+	 * end LSN, any remaining WAL data in the buffer or the tar archive's
+	 * unreached end can be safely ignored.
+	 */
+	astreamer_free(private->archive_streamer);
+
+	/* Close the file. */
+	if (close(private->archive_fd) != 0)
+		pg_log_error("could not close file \"%s\": %m",
+					 private->archive_name);
+}
+
+/*
+ * Reads a WAL page from the archive and verifies WAL segment size.
+ */
+static void
+verify_tar_archive(XLogDumpPrivate *private, const char *waldir,
+				   pg_compress_algorithm compression)
+{
+	PGAlignedXLogBlock buf;
+	int			r;
+
+	setup_astreamer(private, compression, InvalidXLogRecPtr, InvalidXLogRecPtr);
+
+	/* Open the tar archive and store its file descriptor */
+	private->archive_fd = open_file_in_directory(waldir, private->archive_name);
+
+	if (private->archive_fd < 0)
+		pg_fatal("could not open file \"%s\"", private->archive_name);
+
+	/* Read a WAL page */
+	r = astreamer_wal_read(buf.data, InvalidXLogRecPtr, XLOG_BLCKSZ, private);
+
+	/* Set WalSegSz if WAL data is successfully read */
+	if (r == XLOG_BLCKSZ)
+	{
+		XLogLongPageHeader longhdr = (XLogLongPageHeader) buf.data;
+
+		WalSegSz = longhdr->xlp_seg_size;
+
+		if (!IsValidWalSegSize(WalSegSz))
+		{
+			pg_log_error(ngettext("invalid WAL segment size in WAL file \"%s\" (%d byte)",
+								  "invalid WAL segment size in WAL file \"%s\" (%d bytes)",
+								  WalSegSz),
+						 private->archive_name, WalSegSz);
+			pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+			exit(1);
+		}
+	}
+	else
+		pg_fatal("could not read WAL data from \"%s\" archive: read %d of %d",
+				 private->archive_name, r, XLOG_BLCKSZ);
+
+	free_tar_archive_reader(private);
+}
+
 /* Returns the size in bytes of the data to be read. */
 static inline int
 required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -406,7 +559,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 				XLogRecPtr targetPtr, char *readBuff)
 {
 	XLogDumpPrivate *private = state->private_data;
-	int			count = required_read_len(private, targetPagePtr, reqLen);
+	int			count = required_read_len(private, targetPtr, reqLen);
 	WALReadError errinfo;
 
 	if (private->endptr_reached)
@@ -436,6 +589,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 	return count;
 }
 
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */
+static void
+TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
+					  TimeLineID *tli_p)
+{
+	/* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->segment_close callback.
+ */
+static void
+TarWALDumpCloseSegment(XLogReaderState *state)
+{
+	/* No action needed */
+}
+
+/*
+ * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL
+ * files from tar archives.
+ */
+static int
+TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
+				   XLogRecPtr targetPtr, char *readBuff)
+{
+	XLogDumpPrivate *private = state->private_data;
+	int			count = required_read_len(private, targetPtr, reqLen);
+
+	if (private->endptr_reached)
+		return -1;
+
+	/* Read the WAL page from the archive streamer */
+	return astreamer_wal_read(readBuff, targetPagePtr, count, private);
+}
+
 /*
  * Boolean to return whether the given WAL record matches a specific relation
  * and optionally block.
@@ -773,8 +964,8 @@ usage(void)
 	printf(_("  -F, --fork=FORK        only show records that modify blocks in fork FORK;\n"
 			 "                         valid names are main, fsm, vm, init\n"));
 	printf(_("  -n, --limit=N          number of records to display\n"));
-	printf(_("  -p, --path=PATH        directory in which to find WAL segment files or a\n"
-			 "                         directory with a ./pg_wal that contains such files\n"
+	printf(_("  -p, --path=PATH        tar archive or a directory in which to find WAL segment files or\n"
+			 "                         a directory with a ./pg_wal that contains such files\n"
 			 "                         (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n"));
 	printf(_("  -q, --quiet            do not print any output, except for errors\n"));
 	printf(_("  -r, --rmgr=RMGR        only show records generated by resource manager RMGR;\n"
@@ -806,7 +997,11 @@ main(int argc, char **argv)
 	XLogRecord *record;
 	XLogRecPtr	first_record;
 	char	   *waldir = NULL;
+	char	   *walpath = NULL;
 	char	   *errormsg;
+	bool		is_tar = false;
+	XLogReaderRoutine *routine = NULL;
+	pg_compress_algorithm compression;
 
 	static struct option long_options[] = {
 		{"bkp-details", no_argument, NULL, 'b'},
@@ -938,7 +1133,7 @@ main(int argc, char **argv)
 				}
 				break;
 			case 'p':
-				waldir = pg_strdup(optarg);
+				walpath = pg_strdup(optarg);
 				break;
 			case 'q':
 				config.quiet = true;
@@ -1102,10 +1297,20 @@ main(int argc, char **argv)
 		goto bad_argument;
 	}
 
-	if (waldir != NULL)
+	if (walpath != NULL)
 	{
+		/* validate path points to tar archive */
+		if (is_tar_file(walpath, &compression))
+		{
+			char	   *fname = NULL;
+
+			split_path(walpath, &waldir, &fname);
+
+			private.archive_name = fname;
+			is_tar = true;
+		}
 		/* validate path points to directory */
-		if (!verify_directory(waldir))
+		else if (!verify_directory(walpath))
 		{
-			pg_log_error("could not open directory \"%s\": %m", waldir);
+			pg_log_error("could not open directory \"%s\": %m", walpath);
 			goto bad_argument;
@@ -1129,44 +1334,23 @@ main(int argc, char **argv)
 
 		split_path(argv[optind], &directory, &fname);
 
-		if (waldir == NULL && directory != NULL)
+		if (walpath == NULL && directory != NULL)
 		{
-			waldir = directory;
+			walpath = directory;
 
-			if (!verify_directory(waldir))
+			if (!verify_directory(walpath))
-				pg_fatal("could not open directory \"%s\": %m", waldir);
+				pg_fatal("could not open directory \"%s\": %m", walpath);
 		}
 
-		waldir = identify_target_directory(waldir, fname);
-		fd = open_file_in_directory(waldir, fname);
-		if (fd < 0)
-			pg_fatal("could not open file \"%s\"", fname);
-		close(fd);
-
-		/* parse position from file */
-		XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
-
-		if (XLogRecPtrIsInvalid(private.startptr))
-			XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
-		else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+		if (fname != NULL && is_tar_file(fname, &compression))
 		{
-			pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
-						 LSN_FORMAT_ARGS(private.startptr),
-						 fname);
-			goto bad_argument;
+			private.archive_name = fname;
+			waldir = walpath;
+			is_tar = true;
 		}
-
-		/* no second file specified, set end position */
-		if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
-			XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
-
-		/* parse ENDSEG if passed */
-		if (optind + 1 < argc)
+		else
 		{
-			XLogSegNo	endsegno;
-
-			/* ignore directory, already have that */
-			split_path(argv[optind + 1], &directory, &fname);
+			waldir = identify_target_directory(walpath, fname);
 
 			fd = open_file_in_directory(waldir, fname);
 			if (fd < 0)
@@ -1174,32 +1358,67 @@ main(int argc, char **argv)
 			close(fd);
 
 			/* parse position from file */
-			XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+			XLogFromFileName(fname, &private.timeline, &segno, WalSegSz);
 
-			if (endsegno < segno)
-				pg_fatal("ENDSEG %s is before STARTSEG %s",
-						 argv[optind + 1], argv[optind]);
+			if (XLogRecPtrIsInvalid(private.startptr))
+				XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr);
+			else if (!XLByteInSeg(private.startptr, segno, WalSegSz))
+			{
+				pg_log_error("start WAL location %X/%08X is not inside file \"%s\"",
+							 LSN_FORMAT_ARGS(private.startptr),
+							 fname);
+				goto bad_argument;
+			}
 
-			if (XLogRecPtrIsInvalid(private.endptr))
-				XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
-										private.endptr);
+			/* no second file specified, set end position */
+			if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr))
+				XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr);
 
-			/* set segno to endsegno for check of --end */
-			segno = endsegno;
-		}
+			/* parse ENDSEG if passed */
+			if (optind + 1 < argc)
+			{
+				XLogSegNo	endsegno;
 
+				/* ignore directory, already have that */
+				split_path(argv[optind + 1], &directory, &fname);
 
-		if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
-			private.endptr != (segno + 1) * WalSegSz)
-		{
-			pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
-						 LSN_FORMAT_ARGS(private.endptr),
-						 argv[argc - 1]);
-			goto bad_argument;
+				fd = open_file_in_directory(waldir, fname);
+				if (fd < 0)
+					pg_fatal("could not open file \"%s\"", fname);
+				close(fd);
+
+				/* parse position from file */
+				XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz);
+
+				if (endsegno < segno)
+					pg_fatal("ENDSEG %s is before STARTSEG %s",
+							 argv[optind + 1], argv[optind]);
+
+				if (XLogRecPtrIsInvalid(private.endptr))
+					XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz,
+											private.endptr);
+
+				/* set segno to endsegno for check of --end */
+				segno = endsegno;
+			}
+
+
+			if (!XLByteInSeg(private.endptr, segno, WalSegSz) &&
+				private.endptr != (segno + 1) * WalSegSz)
+			{
+				pg_log_error("end WAL location %X/%08X is not inside file \"%s\"",
+							 LSN_FORMAT_ARGS(private.endptr),
+							 argv[argc - 1]);
+				goto bad_argument;
+			}
 		}
 	}
-	else
-		waldir = identify_target_directory(waldir, NULL);
+	else if (!is_tar)
+		waldir = identify_target_directory(walpath, NULL);
+
+	/* Verify that the archive contains valid WAL files */
+	if (is_tar)
+		verify_tar_archive(&private, waldir, compression);
 
 	/* we don't know what to print */
 	if (XLogRecPtrIsInvalid(private.startptr))
@@ -1211,11 +1430,26 @@ main(int argc, char **argv)
 	/* done with argument parsing, do the actual work */
 
 	/* we have everything we need, start reading */
+	if (is_tar)
+	{
+		/* Set up for reading tar file */
+		init_tar_archive_reader(&private, waldir, compression);
+
+		/* Routine to decode WAL files in tar archive */
+		routine = XL_ROUTINE(.page_read = TarWALDumpReadPage,
+							 .segment_open = TarWALDumpOpenSegment,
+							 .segment_close = TarWALDumpCloseSegment);
+	}
+	else
+	{
+		/* Routine to decode WAL files */
+		routine = XL_ROUTINE(.page_read = WALDumpReadPage,
+							 .segment_open = WALDumpOpenSegment,
+							 .segment_close = WALDumpCloseSegment);
+	}
+
 	xlogreader_state =
-		XLogReaderAllocate(WalSegSz, waldir,
-						   XL_ROUTINE(.page_read = WALDumpReadPage,
-									  .segment_open = WALDumpOpenSegment,
-									  .segment_close = WALDumpCloseSegment),
+		XLogReaderAllocate(WalSegSz, waldir, routine,
 						   &private);
 	if (!xlogreader_state)
 		pg_fatal("out of memory while allocating a WAL reading processor");
@@ -1325,6 +1559,9 @@ main(int argc, char **argv)
 
 	XLogReaderFree(xlogreader_state);
 
+	if (is_tar)
+		free_tar_archive_reader(&private);
+
 	return EXIT_SUCCESS;
 
 bad_argument:
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index cd9a36d7447..d2c2307d6c2 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -12,6 +12,8 @@
 #define PG_WALDUMP_H
 
 #include "access/xlogdefs.h"
+#include "fe_utils/astreamer.h"
+#include "lib/stringinfo.h"
 
 extern int WalSegSz;
 
@@ -22,6 +24,23 @@ typedef struct XLogDumpPrivate
 	XLogRecPtr	startptr;
 	XLogRecPtr	endptr;
 	bool		endptr_reached;
+
+	/* Fields required to read WAL from archive */
+	char	   *archive_name;	/* Tar archive name */
+	int			archive_fd;		/* File descriptor for the open tar file */
+
+	astreamer  *archive_streamer;
+	StringInfo	archive_streamer_buf;	/* Buffer for receiving WAL data */
+	XLogRecPtr	archive_streamer_read_ptr; /* Populate the buffer with records
+											  until this record pointer */
 } XLogDumpPrivate;
 
-#endif		/* end of PG_WALDUMP_H */
+
+extern astreamer *astreamer_waldump_content_new(astreamer *next,
+												XLogRecPtr startptr,
+												XLogRecPtr endptr,
+												XLogDumpPrivate *privateInfo);
+extern int	astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count,
+							   XLogDumpPrivate *privateInfo);
+
+#endif							/* end of PG_WALDUMP_H */
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 1b712e8d74d..80298d2a51d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -3,10 +3,13 @@
 
 use strict;
 use warnings FATAL => 'all';
+use Cwd;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
 
+my $tar = $ENV{TAR};
+
 program_help_ok('pg_waldump');
 program_version_ok('pg_waldump');
 program_options_handling_ok('pg_waldump');
@@ -235,7 +238,7 @@ command_like(
 sub test_pg_waldump
 {
 	local $Test::Builder::Level = $Test::Builder::Level + 1;
-	my @opts = @_;
+	my ($path, @opts) = @_;
 
 	my ($stdout, $stderr);
 
@@ -243,6 +246,7 @@ sub test_pg_waldump
 		'pg_waldump',
 		'--start' => $start_lsn,
 		'--end' => $end_lsn,
+		'--path' => $path,
 		@opts
 	  ],
 	  '>' => \$stdout,
@@ -254,11 +258,27 @@ sub test_pg_waldump
 	return @lines;
 }
 
-my @lines;
+my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
 
 my @scenario = (
 	{
-		'path' => $node->data_dir
+		'path' => $node->data_dir,
+		'is_archive' => 0,
+		'enabled' => 1
+	},
+	{
+		'path' => "$tmp_dir/pg_wal.tar",
+		'compression_method' => 'none',
+		'compression_flags' => '-cf',
+		'is_archive' => 1,
+		'enabled' => 1
+	},
+	{
+		'path' => "$tmp_dir/pg_wal.tar.gz",
+		'compression_method' => 'gzip',
+		'compression_flags' => '-czf',
+		'is_archive' => 1,
+		'enabled' => check_pg_config("#define HAVE_LIBZ 1")
 	});
 
 for my $scenario (@scenario)
@@ -267,6 +287,22 @@ for my $scenario (@scenario)
 
 	SKIP:
 	{
+		skip "tar command is not available", 3
+		  if !defined $tar && $scenario->{'is_archive'};
+		skip "$scenario->{'compression_method'} compression not supported by this build", 3
+		  if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+
+		  # create pg_wal archive
+		  if ($scenario->{'is_archive'})
+		  {
+			  # move into the WAL directory before archiving files
+			  my $cwd = getcwd;
+			  chdir($node->data_dir . '/pg_wal/') || die "chdir: $!";
+			  command_ok(
+				  [ $tar, $scenario->{'compression_flags'}, $path, '.' ]);
+			  chdir($cwd) || die "chdir: $!";
+		  }
+
 		command_fails_like(
 			[ 'pg_waldump', '--path' => $path ],
 			qr/error: no start WAL location given/,
@@ -298,38 +334,42 @@ for my $scenario (@scenario)
 			qr/error: error in WAL record at/,
 			'errors are shown with --quiet');
 
-		@lines = test_pg_waldump('--path' => $path);
+		my @lines;
+		@lines = test_pg_waldump($path);
 		is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
 
-		@lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+		@lines = test_pg_waldump($path, '--limit' => 6);
 		is(@lines, 6, 'limit option observed');
 
-		@lines = test_pg_waldump('--path' => $path, '--fullpage');
+		@lines = test_pg_waldump($path, '--fullpage');
 		is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
 
-		@lines = test_pg_waldump('--path' => $path, '--stats');
+		@lines = test_pg_waldump($path, '--stats');
 		like($lines[0], qr/WAL statistics/, "statistics on stdout");
 		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-		@lines = test_pg_waldump('--path' => $path, '--stats=record');
+		@lines = test_pg_waldump($path, '--stats=record');
 		like($lines[0], qr/WAL statistics/, "statistics on stdout");
 		is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-		@lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+		@lines = test_pg_waldump($path, '--rmgr' => 'Btree');
 		is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
 
-		@lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+		@lines = test_pg_waldump($path, '--fork' => 'init');
 		is(grep(!/fork init/, @lines), 0, 'only init fork lines');
 
-		@lines = test_pg_waldump('--path' => $path,
+		@lines = test_pg_waldump($path,
 			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
 		is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
 			0, 'only lines for selected relation');
 
-		@lines = test_pg_waldump('--path' => $path,
+		@lines = test_pg_waldump($path,
 			'--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
 			'--block' => 1);
 		is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+
+		# Cleanup.
+		unlink $path if $scenario->{'is_archive'};
 	}
 }
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e6f2e93b2d6..d8428ce2352 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -3445,6 +3445,7 @@ astreamer_recovery_injector
 astreamer_tar_archiver
 astreamer_tar_parser
 astreamer_verify
+astreamer_waldump
 astreamer_zstd_frame
 auth_password_hook_typ
 autovac_table
-- 
2.47.1
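
Note for reviewers: the suffix check done by is_tar_file() in the patch
above can be tried out on its own. The following is only a minimal,
table-driven sketch of the same rule (the name must be strictly longer
than the suffix and must end with it). ArchiveCompression, suffix_map and
archive_compression_from_name() are hypothetical stand-ins invented for
this note; they are not part of the patch and do not replace
pg_compress_algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pg_compress_algorithm (assumption). */
typedef enum
{
	ARCHIVE_NONE,
	ARCHIVE_GZIP,
	ARCHIVE_LZ4,
	ARCHIVE_ZSTD
} ArchiveCompression;

/* Suffixes recognized by the patch, paired with the implied compression. */
static const struct
{
	const char *suffix;
	ArchiveCompression algo;
}			suffix_map[] =
{
	{".tar", ARCHIVE_NONE},
	{".tgz", ARCHIVE_GZIP},
	{".tar.gz", ARCHIVE_GZIP},
	{".tar.lz4", ARCHIVE_LZ4},
	{".tar.zst", ARCHIVE_ZSTD},
};

/*
 * Returns true and sets *out if fname ends with a known tar suffix and has
 * at least one character before that suffix; returns false otherwise.
 */
bool
archive_compression_from_name(const char *fname, ArchiveCompression *out)
{
	size_t		fname_len;

	assert(fname != NULL && out != NULL);
	fname_len = strlen(fname);

	for (size_t i = 0; i < sizeof(suffix_map) / sizeof(suffix_map[0]); i++)
	{
		size_t		suffix_len = strlen(suffix_map[i].suffix);

		if (fname_len > suffix_len &&
			strcmp(fname + fname_len - suffix_len,
				   suffix_map[i].suffix) == 0)
		{
			*out = suffix_map[i].algo;
			return true;
		}
	}

	return false;
}
```

With this sketch, "pg_wal.tar.gz" and "pg_wal.tgz" both map to gzip, while
a bare ".tar" or a plain segment name is rejected, which matches the
if-chain in the patch.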

From 7469b7b6bf3dd84d092fd86f69bf5ab574ee4f85 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Thu, 7 Aug 2025 17:37:23 +0530
Subject: [PATCH v1 6/9] WIP-pg_waldump: Remove the restriction on the order of
 archived WAL files.

With the previous patch, pg_waldump stopped decoding if the WAL files in
the archive were not in sequential order. With this patch, decoding
continues.  Any WAL file that is out of order is written to a temporary
location and read from there later. Once a temporary file has been read,
it is removed.

TODO:
  Timeline switching is not handled correctly, especially when a
  timeline change occurs on the next WAL file that was previously
  written to a temporary location.
---
 doc/src/sgml/ref/pg_waldump.sgml       |   8 +-
 src/bin/pg_waldump/astreamer_waldump.c | 188 +++++++++++++++++++++----
 src/bin/pg_waldump/pg_waldump.c        |  99 ++++++++++++-
 src/bin/pg_waldump/pg_waldump.h        |   1 +
 src/bin/pg_waldump/t/001_basic.pl      |  40 +++++-
 5 files changed, 301 insertions(+), 35 deletions(-)
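
Note for reviewers: the per-member decision described in the commit
message above (stream the expected segment, spill an out-of-order one,
ignore an old duplicate) can be sketched in isolation as follows. This is
a simplified model only, under stated assumptions: SegAction, SegTracker,
MAX_SPILLED and classify_segment() are hypothetical names invented for
this note, segments are tracked by number in a fixed-size array rather
than by file name in a SimpleStringList, and no files are actually
written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome for one archive member (not part of the patch). */
typedef enum
{
	SEG_SKIP,					/* older than expected: ignore duplicate */
	SEG_STREAM,					/* next expected segment: feed the buffer */
	SEG_SPILL					/* out of order: write to a temporary file */
} SegAction;

#define MAX_SPILLED 64			/* arbitrary capacity for this sketch */

typedef struct
{
	uint64_t	next_segno;		/* next segment the decoder expects */
	uint64_t	spilled[MAX_SPILLED];	/* segments already spilled */
	size_t		nspilled;
} SegTracker;

/* Has this segment number already been written to temporary space? */
static bool
was_spilled(const SegTracker *t, uint64_t segno)
{
	for (size_t i = 0; i < t->nspilled; i++)
	{
		if (t->spilled[i] == segno)
			return true;
	}
	return false;
}

/* Decide what to do with the archive member holding segment "segno". */
SegAction
classify_segment(SegTracker *t, uint64_t segno)
{
	/*
	 * Expected segments that were spilled earlier are served from
	 * temporary files, so advance past them first.
	 */
	while (was_spilled(t, t->next_segno))
		t->next_segno++;

	if (segno < t->next_segno)
		return SEG_SKIP;

	if (segno == t->next_segno)
	{
		t->next_segno++;
		return SEG_STREAM;
	}

	/* Out of order: remember that it now lives in temporary space. */
	assert(t->nspilled < MAX_SPILLED);
	t->spilled[t->nspilled++] = segno;
	return SEG_SPILL;
}
```

For an archive that stores segments in the order 1, 3, 4, 2, 5, this
model streams 1, spills 3 and 4, streams 2, and then streams 5 once the
expected-segment counter has advanced past the two spilled segments.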

diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d004bb0f67e..8a28b4f0f91 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -149,8 +149,12 @@ PostgreSQL documentation
         of <envar>PGDATA</envar>.
        </para>
        <para>
-        If a tar archive is provided, its WAL segment files must be in
-        sequential order; otherwise, an error will be reported.
+        If a tar archive is provided and its WAL segment files are not in
+        sequential order, those files will be written to a temporary directory
+        named <filename>pg_waldump_tmp_dir/</filename>. This directory will be
+        created inside the directory specified by the <envar>TMPDIR</envar>
+        environment variable if it is set; otherwise, it will be created within
+        the same directory as the tar archive.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
index d0ac903c54e..a088c33b16f 100644
--- a/src/bin/pg_waldump/astreamer_waldump.c
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -18,6 +18,7 @@
 
 #include "access/xlog_internal.h"
 #include "access/xlogdefs.h"
+#include "common/file_perm.h"
 #include "common/logging.h"
 #include "fe_utils/simple_list.h"
 #include "pg_waldump.h"
@@ -37,6 +38,9 @@ typedef struct astreamer_waldump
 
 	/* These fields change with archive member. */
 	bool		skipThisSeg;
+	bool		writeThisSeg;
+	FILE	   *segFp;
+	SimpleStringList exportedSegList;	/* Temporary exported segment list */
 	XLogSegNo	nextSegNo;		/* Next expected segment to stream */
 } astreamer_waldump;
 
@@ -53,8 +57,11 @@ static bool member_is_relevant_wal(astreamer_member *member,
 								   XLogSegNo startSegNo,
 								   XLogSegNo endSegNo,
 								   XLogSegNo nextSegNo,
+								   char **curFname,
 								   XLogSegNo *curSegNo,
 								   TimeLineID *curSegTimeline);
+static bool member_needs_temp_write(astreamer_waldump *mystreamer,
+									const char *fname);
 
 static const astreamer_ops astreamer_waldump_ops = {
 	.content = astreamer_waldump_content,
@@ -189,17 +196,8 @@ astreamer_waldump_content_new(astreamer *next, XLogRecPtr startptr,
 	if (XLogRecPtrIsInvalid(startptr))
 		streamer->startSegNo = 0;
 	else
-	{
 		XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
 
-		/*
-		 * Initialize the record pointer to the beginning of the first
-		 * segment; this pointer will track the WAL record reading status.
-		 */
-		XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
-								privateInfo->archive_streamer_read_ptr);
-	}
-
 	if (XLogRecPtrIsInvalid(endPtr))
 		streamer->endSegNo = UINT64_MAX;
 	else
@@ -228,19 +226,21 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 	{
 		case ASTREAMER_MEMBER_HEADER:
 			{
+				char	   *fname;
 				XLogSegNo	segNo;
 				TimeLineID	timeline;
 
 				pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
 
 				mystreamer->skipThisSeg = false;
+				mystreamer->writeThisSeg = false;
 
 				if (!member_is_relevant_wal(member,
 											privateInfo->timeline,
 											mystreamer->startSegNo,
 											mystreamer->endSegNo,
 											mystreamer->nextSegNo,
-											&segNo, &timeline))
+											&fname, &segNo, &timeline))
 				{
 					mystreamer->skipThisSeg = true;
 					break;
@@ -254,24 +254,37 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 				if (mystreamer->nextSegNo == 0)
 					break;
 
-				/* WAL segments must be archived in order */
-				if (mystreamer->nextSegNo != segNo)
+				/*
+				 * When WAL segments are not archived sequentially, it becomes
+				 * necessary to write out (or preserve) segments that might be
+				 * required at a later point.
+				 */
+				if (mystreamer->nextSegNo != segNo &&
+					member_needs_temp_write(mystreamer, fname))
 				{
-					pg_log_error("WAL files are not archived in sequential order");
-					pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
-										mystreamer->nextSegNo, segNo);
-					exit(1);
+					mystreamer->writeThisSeg = true;
+					break;
 				}
 
 				/*
-				 * We track the reading of WAL segment records using a pointer
-				 * that's continuously incremented by the length of the
-				 * received data. This pointer is crucial for serving WAL page
-				 * requests from the WAL decoding routine, so it must be
-				 * accurate.
+				 * We are now streaming segment content.
+				 *
+				 * We need to track the reading of WAL segment records using a
+				 * pointer that's typically incremented by the length of the
+				 * data read. However, we sometimes export the WAL file to
+				 * temporary storage, allowing the decoding routine to read
+				 * directly from there. This makes continuous pointer
+				 * incrementing challenging, as file reads can occur from any
+				 * offset, leading to potential errors. Therefore, we now
+				 * reset the pointer when reading from a file for streaming.
+				 * Also, if there's any existing data in the buffer, the next
+				 * WAL record should logically follow it.
 				 */
 #ifdef USE_ASSERT_CHECKING
-				if (mystreamer->nextSegNo != 0)
+				Assert(!mystreamer->skipThisSeg);
+				Assert(!mystreamer->writeThisSeg);
+
+				if (privateInfo->archive_streamer_buf->len != 0)
 				{
 					XLogRecPtr	recPtr;
 
@@ -280,6 +293,13 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 				}
 #endif
 
+				/*
+				 * Initialize the read pointer to the beginning of the segment
+				 * that is now being streamed through the buffer.
+				 */
+				XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz,
+										privateInfo->archive_streamer_read_ptr);
+
 				/* Save the timeline */
 				privateInfo->timeline = timeline;
 
@@ -293,12 +313,44 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
 			if (mystreamer->skipThisSeg)
 				break;
 
+			/* Or, write contents to file */
+			if (mystreamer->writeThisSeg)
+			{
+				Assert(mystreamer->segFp != NULL);
+
+				errno = 0;
+				if (len > 0 && fwrite(data, len, 1, mystreamer->segFp) != 1)
+				{
+					char	   *fname;
+					int			pathlen = strlen(member->pathname);
+
+					Assert(pathlen >= XLOG_FNAME_LEN);
+
+					fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+
+					/*
+					 * If write didn't set errno, assume problem is no disk
+					 * space
+					 */
+					if (errno == 0)
+						errno = ENOSPC;
+					pg_fatal("could not write to file \"%s/%s\": %m",
+							 privateInfo->tmpdir, fname);
+				}
+				break;
+			}
+
 			/* Or, copy contents to buffer */
 			privateInfo->archive_streamer_read_ptr += len;
 			astreamer_buffer_bytes(streamer, &data, &len, len);
 			break;
 
 		case ASTREAMER_MEMBER_TRAILER:
+			if (mystreamer->segFp != NULL)
+			{
+				fclose(mystreamer->segFp);
+				mystreamer->segFp = NULL;
+			}
 			break;
 
 		case ASTREAMER_ARCHIVE_TRAILER:
@@ -325,8 +377,14 @@ astreamer_waldump_finalize(astreamer *streamer)
 static void
 astreamer_waldump_free(astreamer *streamer)
 {
+	astreamer_waldump *mystreamer;
+
 	Assert(streamer->bbs_next == NULL);
 
+	mystreamer = (astreamer_waldump *) streamer;
+	if (mystreamer->segFp != NULL)
+		fclose(mystreamer->segFp);
+
 	pfree(streamer->bbs_buffer.data);
 	pfree(streamer);
 }
@@ -339,8 +397,8 @@ astreamer_waldump_free(astreamer *streamer)
 static bool
 member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
 					   XLogSegNo startSegNo, XLogSegNo endSegNo,
-					   XLogSegNo nextSegNo, XLogSegNo *curSegNo,
-					   TimeLineID *curSegTimeline)
+					   XLogSegNo nextSegNo, char **curFname,
+					   XLogSegNo *curSegNo, TimeLineID *curSegTimeline)
 {
 	int			pathlen;
 	XLogSegNo	segNo;
@@ -371,8 +429,90 @@ member_is_relevant_wal(astreamer_member *member, TimeLineID startTimeLineID,
 	if (startSegNo > segNo || endSegNo < segNo)
 		return false;
 
+	/*
+	 * Corner case: a segment older than the next expected one has already
+	 * been streamed, so ignore this duplicate archive member.
+	 */
+	if (nextSegNo > segNo)
+		return false;
+
+	*curFname = fname;
 	*curSegNo = segNo;
 	*curSegTimeline = timeline;
 
 	return true;
 }
+
+/*
+ * Returns true and creates a temporary file if the given WAL segment needs to
+ * be written to temporary space. This is required when the segment is not the
+ * one currently being decoded. Conversely, if a temporary file for the
+ * preceding segment already exists and the current segment is its direct
+ * successor, then writing to temporary space is not necessary, and false is
+ * returned.
+ */
+static bool
+member_needs_temp_write(astreamer_waldump *mystreamer, const char *fname)
+{
+	bool		exists;
+	XLogSegNo	segNo;
+	TimeLineID	timeline;
+	XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+	/* Parse position from file */
+	XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+	/*
+	 * If we find a file that was previously written to the temporary space,
+	 * it indicates that the corresponding WAL segment request has already
+	 * been fulfilled. In that case, we increment the nextSegNo counter and
+	 * check again whether the current segment number matches the required WAL
+	 * segment (i.e. nextSegNo). If it does, we allow it to stream normally
+	 * through the buffer. Otherwise, we write it to the temporary space, from
+	 * where the caller is expected to read it directly.
+	 */
+	do
+	{
+		char		segName[MAXFNAMELEN];
+
+		XLogFileName(segName, timeline, mystreamer->nextSegNo, WalSegSz);
+
+		/*
+		 * If the WAL segment has already been exported, increment the counter
+		 * and check for the next segment.
+		 */
+		exists = false;
+		if (simple_string_list_member(&mystreamer->exportedSegList, segName))
+		{
+			mystreamer->nextSegNo += 1;
+			exists = true;
+		}
+	} while (exists);
+
+	/*
+	 * Need to export this segment to disk; create an empty placeholder file
+	 * to be written once its content is received.
+	 */
+	if (mystreamer->nextSegNo != segNo)
+	{
+		char		fpath[MAXPGPATH];
+
+		snprintf(fpath, MAXPGPATH, "%s/%s", privateInfo->tmpdir, fname);
+
+		mystreamer->segFp = fopen(fpath, PG_BINARY_W);
+		if (mystreamer->segFp == NULL)
+			pg_fatal("could not create file \"%s\": %m", fpath);
+
+#ifndef WIN32
+		if (chmod(fpath, pg_file_create_mode))
+			pg_fatal("could not set permissions on file \"%s\": %m",
+					 fpath);
+#endif
+
+		/* Record this segment's export to temporary space */
+		simple_string_list_append(&mystreamer->exportedSegList, fname);
+		return true;
+	}
+
+	return false;
+}
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 64f3a65b735..54a3b2dacda 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -325,6 +325,51 @@ identify_target_directory(char *directory, char *fname)
 	return NULL;				/* not reached */
 }
 
+/*
+ * Set up a temporary directory for storing out-of-order WAL segments.
+ */
+static char *
+setup_tmp_dir(char *waldir)
+{
+	char	   *tmpdir = waldir != NULL ? pstrdup(waldir) : pstrdup(".");
+
+	canonicalize_path(tmpdir);
+	tmpdir = psprintf("%s/pg_waldump_tmp_dir",
+					  getenv("TMPDIR") ? getenv("TMPDIR") : tmpdir);
+
+	create_directory(tmpdir);
+
+	return tmpdir;
+}
+
+/*
+ * Removes a directory along with its contents, if any.
+ */
+static void
+remove_tmp_dir(char *tmpdir)
+{
+	DIR		   *dir;
+	struct dirent *de;
+
+	dir = opendir(tmpdir);
+	while ((de = readdir(dir)) != NULL)
+	{
+		char		path[MAXPGPATH];
+
+		if (strcmp(de->d_name, ".") == 0 ||
+			strcmp(de->d_name, "..") == 0)
+			continue;
+
+		snprintf(path, MAXPGPATH, "%s/%s", tmpdir, de->d_name);
+		unlink(path);
+	}
+	closedir(dir);
+
+	if (rmdir(tmpdir) < 0)
+		pg_log_error("could not remove directory \"%s\": %m",
+					 tmpdir);
+}
+
 /*
  * Returns true if the given file is a tar archive and outputs its compression
  * algorithm.
@@ -559,7 +604,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 				XLogRecPtr targetPtr, char *readBuff)
 {
 	XLogDumpPrivate *private = state->private_data;
-	int			count = required_read_len(private, targetPtr, reqLen);
+	int			count = required_read_len(private, targetPagePtr, reqLen);
 	WALReadError errinfo;
 
 	if (private->endptr_reached)
@@ -618,12 +663,53 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
 				   XLogRecPtr targetPtr, char *readBuff)
 {
 	XLogDumpPrivate *private = state->private_data;
-	int			count = required_read_len(private, targetPtr, reqLen);
+	int			count = required_read_len(private, targetPagePtr, reqLen);
+	XLogSegNo	nextSegNo;
 
 	if (private->endptr_reached)
 		return -1;
 
-	/* Read the WAL page from the archive streamer */
+	/*
+	 * If the target page is in a different segment, first check for the WAL
+	 * segment's physical existence in the temporary directory.
+	 *
+	 * XXX: Timeline change is not handled.
+	 */
+	nextSegNo = state->seg.ws_segno;
+	if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz))
+	{
+		char		fname[MAXPGPATH];
+
+		if (state->seg.ws_file >= 0)
+		{
+			char		fpath[MAXPGPATH];
+
+			close(state->seg.ws_file);
+			state->seg.ws_file = -1;
+
+			/* Remove this file, as it is no longer needed. */
+			XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz);
+			snprintf(fpath, MAXPGPATH, "%s/%s", private->tmpdir, fname);
+			unlink(fpath);
+		}
+
+		XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz);
+		state->seg.ws_tli = private->timeline;
+		state->seg.ws_segno = nextSegNo;
+
+		/*
+		 * If the next segment exists, open it and continue reading from there.
+		 */
+		XLogFileName(fname, private->timeline, nextSegNo, WalSegSz);
+		state->seg.ws_file = open_file_in_directory(private->tmpdir, fname);
+	}
+
+	/* Continue reading from the open WAL segment, if any */
+	if (state->seg.ws_file >= 0)
+		return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr,
+							   readBuff);
+
+	/* Otherwise, read the WAL page from the archive streamer */
 	return astreamer_wal_read(readBuff, targetPagePtr, count, private);
 }
 
@@ -1435,6 +1521,9 @@ main(int argc, char **argv)
 		/* Set up for reading tar file */
 		init_tar_archive_reader(&private, waldir, compression);
 
+		/* Create temporary space for writing WAL segments. */
+		private.tmpdir = setup_tmp_dir(waldir);
+
 		/* Routine to decode WAL files in tar archive */
 		routine = XL_ROUTINE(.page_read = TarWALDumpReadPage,
 							 .segment_open = TarWALDumpOpenSegment,
@@ -1549,6 +1638,10 @@ main(int argc, char **argv)
 	if (config.stats == true && !config.quiet)
 		XLogDumpDisplayStats(&config, &stats);
 
+	/* Remove temporary directory if any */
+	if (private.tmpdir != NULL)
+		remove_tmp_dir(private.tmpdir);
+
 	if (time_to_stop)
 		exit(0);
 
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
index d2c2307d6c2..2644d847b47 100644
--- a/src/bin/pg_waldump/pg_waldump.h
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -33,6 +33,7 @@ typedef struct XLogDumpPrivate
 	StringInfo	archive_streamer_buf;	/* Buffer for receiving WAL data */
 	XLogRecPtr	archive_streamer_read_ptr; /* Populate the buffer with records
 											  until this record pointer */
+	char	   *tmpdir;
 } XLogDumpPrivate;
 
 
diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index 80298d2a51d..a3bf950db97 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -7,6 +7,8 @@ use Cwd;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
 use Test::More;
+use File::Path qw(rmtree);
+use List::Util qw(shuffle);
 
 my $tar = $ENV{TAR};
 
@@ -258,6 +260,32 @@ sub test_pg_waldump
 	return @lines;
 }
 
+# Create a tar archive, shuffling the file order
+sub generate_archive
+{
+	my ($archive, $directory, $compression_flags) = @_;
+
+	my @files;
+	opendir my $dh, $directory or die "opendir: $!";
+	while (my $entry = readdir $dh) {
+		# Skip '.' and '..'
+		next if $entry eq '.' || $entry eq '..';
+		push @files, $entry;
+	}
+	closedir $dh;
+
+	@files = shuffle @files;
+
+	# move into the WAL directory before archiving files
+	my $cwd = getcwd;
+	chdir($directory) || die "chdir: $!";
+	command_ok([$tar, $compression_flags, $archive, @files]);
+	chdir($cwd) || die "chdir: $!";
+
+	# give necessary permission
+	chmod(0755, $archive) || die "chmod $archive: $!";
+}
+
 my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short();
 
 my @scenario = (
@@ -291,16 +319,16 @@ for my $scenario (@scenario)
 		  if !defined $tar;
 		skip "$scenario->{'compression_method'} compression not supported by this build", 3
 		  if !$scenario->{'enabled'} && $scenario->{'is_archive'};
+		skip "unix-style permissions not supported on Windows", 3
+		  if ($scenario->{'is_archive'}
+			&& ($windows_os || $Config::Config{osname} eq 'cygwin'));
 
 		  # create pg_wal archive
 		  if ($scenario->{'is_archive'})
 		  {
-			  # move into the WAL directory before archiving files
-			  my $cwd = getcwd;
-			  chdir($node->data_dir . '/pg_wal/') || die "chdir: $!";
-			  command_ok(
-				  [ $tar, $scenario->{'compression_flags'}, $path , '.' ]);
-			  chdir($cwd) || die "chdir: $!";
+			  generate_archive($path,
+				  $node->data_dir . '/pg_wal',
+				  $scenario->{'compression_flags'});
 		  }
 
 		command_fails_like(
-- 
2.47.1

From 10816f545e7f2f3df1fb9075321d2bd81df195d4 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 16 Jul 2025 14:47:43 +0530
Subject: [PATCH v1 7/9] pg_verifybackup: Delay default WAL directory
 preparation.

We are not sure whether to parse WAL from a directory or an archive
until the backup format is known. Therefore, we delay preparing the
default WAL directory until the point of parsing. This delay is
harmless, as the WAL directory is not used elsewhere.
---
 src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 5e6c13bb921..31ebc1581fb 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -285,10 +285,6 @@ main(int argc, char **argv)
 		manifest_path = psprintf("%s/backup_manifest",
 								 context.backup_directory);
 
-	/* By default, look for the WAL in the backup directory, too. */
-	if (wal_directory == NULL)
-		wal_directory = psprintf("%s/pg_wal", context.backup_directory);
-
 	/*
 	 * Try to read the manifest. We treat any errors encountered while parsing
 	 * the manifest as fatal; there doesn't seem to be much point in trying to
@@ -368,6 +364,10 @@ main(int argc, char **argv)
 	if (context.format == 'p' && !context.skip_checksums)
 		verify_backup_checksums(&context);
 
+	/* By default, look for the WAL in the backup directory, too. */
+	if (wal_directory == NULL)
+		wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+
 	/*
 	 * Try to parse the required ranges of WAL records, unless we were told
 	 * not to do so.
-- 
2.47.1

From c8187d4996df271117afb623db6c72d3033d4b06 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Thu, 24 Jul 2025 16:37:43 +0530
Subject: [PATCH v1 8/9] pg_verifybackup: Rename the wal-directory switch to
 wal-path

Future patches to pg_waldump will enable it to decode WAL directly
from tar files. This means you'll be able to specify a tar archive
path instead of a traditional WAL directory.

To keep things consistent and more versatile, we should also
generalize the input switch for pg_verifybackup. It should accept
either a directory or a tar file path that contains WALs. This change
will also align it with the existing manifest-path switch naming.
---
 doc/src/sgml/ref/pg_verifybackup.sgml     |  2 +-
 src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++-----------
 src/bin/pg_verifybackup/po/de.po          |  4 ++--
 src/bin/pg_verifybackup/po/el.po          |  4 ++--
 src/bin/pg_verifybackup/po/es.po          |  4 ++--
 src/bin/pg_verifybackup/po/fr.po          |  4 ++--
 src/bin/pg_verifybackup/po/it.po          |  4 ++--
 src/bin/pg_verifybackup/po/ja.po          |  4 ++--
 src/bin/pg_verifybackup/po/ka.po          |  4 ++--
 src/bin/pg_verifybackup/po/ko.po          |  4 ++--
 src/bin/pg_verifybackup/po/ru.po          |  4 ++--
 src/bin/pg_verifybackup/po/sv.po          |  4 ++--
 src/bin/pg_verifybackup/po/uk.po          |  4 ++--
 src/bin/pg_verifybackup/po/zh_CN.po       |  4 ++--
 src/bin/pg_verifybackup/po/zh_TW.po       |  4 ++--
 src/bin/pg_verifybackup/t/007_wal.pl      |  4 ++--
 16 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index 61c12975e4a..e9b8bfd51b1 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -261,7 +261,7 @@ PostgreSQL documentation
 
      <varlistentry>
       <term><option>-w <replaceable class="parameter">path</replaceable></option></term>
-      <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term>
+      <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term>
       <listitem>
        <para>
         Try to parse WAL files stored in the specified directory, rather than
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 31ebc1581fb..1ee400199da 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context,
 								 uint8 *buffer);
 static void parse_required_wal(verifier_context *context,
 							   char *pg_waldump_path,
-							   char *wal_directory);
+							   char *wal_path);
 static astreamer *create_archive_verifier(verifier_context *context,
 										  char *archive_name,
 										  Oid tblspc_oid,
@@ -126,7 +126,7 @@ main(int argc, char **argv)
 		{"progress", no_argument, NULL, 'P'},
 		{"quiet", no_argument, NULL, 'q'},
 		{"skip-checksums", no_argument, NULL, 's'},
-		{"wal-directory", required_argument, NULL, 'w'},
+		{"wal-path", required_argument, NULL, 'w'},
 		{NULL, 0, NULL, 0}
 	};
 
@@ -135,7 +135,7 @@ main(int argc, char **argv)
 	char	   *manifest_path = NULL;
 	bool		no_parse_wal = false;
 	bool		quiet = false;
-	char	   *wal_directory = NULL;
+	char	   *wal_path = NULL;
 	char	   *pg_waldump_path = NULL;
 	DIR		   *dir;
 
@@ -221,8 +221,8 @@ main(int argc, char **argv)
 				context.skip_checksums = true;
 				break;
 			case 'w':
-				wal_directory = pstrdup(optarg);
-				canonicalize_path(wal_directory);
+				wal_path = pstrdup(optarg);
+				canonicalize_path(wal_path);
 				break;
 			default:
 				/* getopt_long already emitted a complaint */
@@ -365,15 +365,15 @@ main(int argc, char **argv)
 		verify_backup_checksums(&context);
 
 	/* By default, look for the WAL in the backup directory, too. */
-	if (wal_directory == NULL)
-		wal_directory = psprintf("%s/pg_wal", context.backup_directory);
+	if (wal_path == NULL)
+		wal_path = psprintf("%s/pg_wal", context.backup_directory);
 
 	/*
 	 * Try to parse the required ranges of WAL records, unless we were told
 	 * not to do so.
 	 */
 	if (!no_parse_wal)
-		parse_required_wal(&context, pg_waldump_path, wal_directory);
+		parse_required_wal(&context, pg_waldump_path, wal_path);
 
 	/*
 	 * If everything looks OK, tell the user this, unless we were asked to
@@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m,
  */
 static void
 parse_required_wal(verifier_context *context, char *pg_waldump_path,
-				   char *wal_directory)
+				   char *wal_path)
 {
 	manifest_data *manifest = context->manifest;
 	manifest_wal_range *this_wal_range = manifest->first_wal_range;
@@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path,
 		char	   *pg_waldump_cmd;
 
 		pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n",
-								  pg_waldump_path, wal_directory, this_wal_range->tli,
+								  pg_waldump_path, wal_path, this_wal_range->tli,
 								  LSN_FORMAT_ARGS(this_wal_range->start_lsn),
 								  LSN_FORMAT_ARGS(this_wal_range->end_lsn));
 		fflush(NULL);
@@ -1376,7 +1376,7 @@ usage(void)
 	printf(_("  -P, --progress              show progress information\n"));
 	printf(_("  -q, --quiet                 do not print any output, except for errors\n"));
 	printf(_("  -s, --skip-checksums        skip checksum verification\n"));
-	printf(_("  -w, --wal-directory=PATH    use specified path for WAL files\n"));
+	printf(_("  -w, --wal-path=PATH         use specified path for WAL files\n"));
 	printf(_("  -V, --version               output version information, then exit\n"));
 	printf(_("  -?, --help                  show this help, then exit\n"));
 	printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po
index a9e24931100..9b5cd5898cf 100644
--- a/src/bin/pg_verifybackup/po/de.po
+++ b/src/bin/pg_verifybackup/po/de.po
@@ -785,8 +785,8 @@ msgstr "  -s, --skip-checksums        Überprüfung der Prüfsummen überspringe
 
 #: pg_verifybackup.c:1379
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PFAD    angegebenen Pfad für WAL-Dateien verwenden\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PFAD         angegebenen Pfad für WAL-Dateien verwenden\n"
 
 #: pg_verifybackup.c:1380
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po
index 3e3f20c67c5..81442f51c17 100644
--- a/src/bin/pg_verifybackup/po/el.po
+++ b/src/bin/pg_verifybackup/po/el.po
@@ -494,8 +494,8 @@ msgstr "  -s, --skip-checksums        παράκαμψε την επαλήθευ
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH         χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po
index 0cb958f3448..7f729fa35ba 100644
--- a/src/bin/pg_verifybackup/po/es.po
+++ b/src/bin/pg_verifybackup/po/es.po
@@ -495,8 +495,8 @@ msgstr "  -s, --skip-checksums        omitir la verificación de la suma de comp
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    utilizar la ruta especificada para los archivos WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH         utilizar la ruta especificada para los archivos WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po
index da8c72f6427..09937966fa7 100644
--- a/src/bin/pg_verifybackup/po/fr.po
+++ b/src/bin/pg_verifybackup/po/fr.po
@@ -498,8 +498,8 @@ msgstr "  -s, --skip-checksums        ignore la vérification des sommes de cont
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=CHEMIN  utilise le chemin spécifié pour les fichiers WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=CHEMIN       utilise le chemin spécifié pour les fichiers WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po
index 317b0b71e7f..4da68d0074e 100644
--- a/src/bin/pg_verifybackup/po/it.po
+++ b/src/bin/pg_verifybackup/po/it.po
@@ -472,8 +472,8 @@ msgstr "  -s, --skip-checksums         salta la verifica del checksum\n"
 
 #: pg_verifybackup.c:911
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH     usa il percorso specificato per i file WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH          usa il percorso specificato per i file WAL\n"
 
 #: pg_verifybackup.c:912
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po
index c910fb236cc..a948959b54f 100644
--- a/src/bin/pg_verifybackup/po/ja.po
+++ b/src/bin/pg_verifybackup/po/ja.po
@@ -672,8 +672,8 @@ msgstr "  -s, --skip-checksums        チェックサム検証をスキップ\n"
 
 #: pg_verifybackup.c:1379
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    WALファイルに指定したパスを使用する\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH         WALファイルに指定したパスを使用する\n"
 
 #: pg_verifybackup.c:1380
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po
index 982751984c7..ef2799316a8 100644
--- a/src/bin/pg_verifybackup/po/ka.po
+++ b/src/bin/pg_verifybackup/po/ka.po
@@ -784,8 +784,8 @@ msgstr "  -s, --skip-checksums        საკონტროლო ჯამ
 
 #: pg_verifybackup.c:1379
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=ბილიკი    WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=ბილიკი         WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n"
 
 #: pg_verifybackup.c:1380
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po
index acdc3da5e02..eaf91ef1e98 100644
--- a/src/bin/pg_verifybackup/po/ko.po
+++ b/src/bin/pg_verifybackup/po/ko.po
@@ -501,8 +501,8 @@ msgstr "  -s, --skip-checksums        체크섬 검사 건너뜀\n"
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=경로    WAL 파일이 있는 경로 지정\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=경로         WAL 파일이 있는 경로 지정\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po
index 64005feedfd..7fb0e5ab1f6 100644
--- a/src/bin/pg_verifybackup/po/ru.po
+++ b/src/bin/pg_verifybackup/po/ru.po
@@ -507,9 +507,9 @@ msgstr "  -s, --skip-checksums        пропустить проверку ко
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
 msgstr ""
-"  -w, --wal-directory=ПУТЬ    использовать заданный путь к файлам WAL\n"
+"  -w, --wal-path=ПУТЬ         использовать заданный путь к файлам WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po
index 17240feeb5c..97125838e8c 100644
--- a/src/bin/pg_verifybackup/po/sv.po
+++ b/src/bin/pg_verifybackup/po/sv.po
@@ -492,8 +492,8 @@ msgstr "  -s, --skip-checksums        hoppa över verifiering av kontrollsummor\
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=SÖKVÄG  använd denna sökväg till WAL-filer\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=SÖKVÄG       använd denna sökväg till WAL-filer\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po
index 034b9764232..63f8041ab38 100644
--- a/src/bin/pg_verifybackup/po/uk.po
+++ b/src/bin/pg_verifybackup/po/uk.po
@@ -484,8 +484,8 @@ msgstr "  -s, --skip-checksums не перевіряти контрольні с
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH      використовувати вказаний шлях для файлів WAL\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po
index b7d97c8976d..fb6fcae8b82 100644
--- a/src/bin/pg_verifybackup/po/zh_CN.po
+++ b/src/bin/pg_verifybackup/po/zh_CN.po
@@ -465,8 +465,8 @@ msgstr "  -s, --skip-checksums        跳过校验和验证\n"
 
 #: pg_verifybackup.c:919
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    对WAL文件使用指定路径\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH         对WAL文件使用指定路径\n"
 
 #: pg_verifybackup.c:920
 #, c-format
diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po
index c1b710b0a36..568f972b0bb 100644
--- a/src/bin/pg_verifybackup/po/zh_TW.po
+++ b/src/bin/pg_verifybackup/po/zh_TW.po
@@ -555,8 +555,8 @@ msgstr "  -s, --skip-checksums        跳過檢查碼驗證\n"
 
 #: pg_verifybackup.c:992
 #, c-format
-msgid "  -w, --wal-directory=PATH    use specified path for WAL files\n"
-msgstr "  -w, --wal-directory=PATH    用指定的路徑存放 WAL 檔\n"
+msgid "  -w, --wal-path=PATH         use specified path for WAL files\n"
+msgstr "  -w, --wal-path=PATH         用指定的路徑存放 WAL 檔\n"
 
 #: pg_verifybackup.c:993
 #, c-format
diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl
index babc4f0a86b..b07f80719b0 100644
--- a/src/bin/pg_verifybackup/t/007_wal.pl
+++ b/src/bin/pg_verifybackup/t/007_wal.pl
@@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', $backup_path ],
 command_ok(
 	[
 		'pg_verifybackup',
-		'--wal-directory' => $relocated_pg_wal,
+		'--wal-path' => $relocated_pg_wal,
 		$backup_path
 	],
-	'--wal-directory can be used to specify WAL directory');
+	'--wal-path can be used to specify WAL directory');
 
 # Move directory back to original location.
 rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!";
-- 
2.47.1

From 923a767b076e04c75f6472d2800a22ca99a31d53 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Thu, 17 Jul 2025 16:39:36 +0530
Subject: [PATCH v1 9/9] pg_verifybackup: Enable WAL parsing for tar-format
 backups

Now that pg_waldump supports decoding from tar archives, we should
leverage this functionality to remove the previous restriction on WAL
parsing for tar-format backups.
---
 doc/src/sgml/ref/pg_verifybackup.sgml         |  5 +-
 src/bin/pg_verifybackup/pg_verifybackup.c     | 66 +++++++++++++------
 src/bin/pg_verifybackup/t/002_algorithm.pl    |  4 --
 src/bin/pg_verifybackup/t/003_corruption.pl   |  4 +-
 src/bin/pg_verifybackup/t/008_untar.pl        |  3 +-
 src/bin/pg_verifybackup/t/010_client_untar.pl |  3 +-
 6 files changed, 50 insertions(+), 35 deletions(-)

diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml
index e9b8bfd51b1..16b50b5a4df 100644
--- a/doc/src/sgml/ref/pg_verifybackup.sgml
+++ b/doc/src/sgml/ref/pg_verifybackup.sgml
@@ -36,10 +36,7 @@ PostgreSQL documentation
    <literal>backup_manifest</literal> generated by the server at the time
    of the backup. The backup may be stored either in the "plain" or the "tar"
    format; this includes tar-format backups compressed with any algorithm
-   supported by <application>pg_basebackup</application>. However, at present,
-   <literal>WAL</literal> verification is supported only for plain-format
-   backups. Therefore, if the backup is stored in tar-format, the
-   <literal>-n, --no-parse-wal</literal> option should be used.
+   supported by <application>pg_basebackup</application>.
   </para>
 
   <para>
diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c
index 1ee400199da..4bfe6fdff16 100644
--- a/src/bin/pg_verifybackup/pg_verifybackup.c
+++ b/src/bin/pg_verifybackup/pg_verifybackup.c
@@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context,
 											  const char *fmt,...)
 			pg_attribute_printf(2, 3);
 
-static void verify_tar_backup(verifier_context *context, DIR *dir);
+static void verify_tar_backup(verifier_context *context, DIR *dir,
+							  char **base_archive_path,
+							  char **wal_archive_path);
 static void verify_plain_backup_directory(verifier_context *context,
 										  char *relpath, char *fullpath,
 										  DIR *dir);
@@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath,
 static void verify_control_file(const char *controlpath,
 								uint64 manifest_system_identifier);
 static void precheck_tar_backup_file(verifier_context *context, char *relpath,
-									 char *fullpath, SimplePtrList *tarfiles);
+									 char *fullpath, SimplePtrList *tarfiles,
+									 char **base_archive_path,
+									 char **wal_archive_path);
 static void verify_tar_file(verifier_context *context, char *relpath,
 							char *fullpath, astreamer *streamer);
 static void report_extra_backup_files(verifier_context *context);
@@ -136,6 +140,8 @@ main(int argc, char **argv)
 	bool		no_parse_wal = false;
 	bool		quiet = false;
 	char	   *wal_path = NULL;
+	char	   *base_archive_path = NULL;
+	char	   *wal_archive_path = NULL;
 	char	   *pg_waldump_path = NULL;
 	DIR		   *dir;
 
@@ -327,17 +333,6 @@ main(int argc, char **argv)
 		pfree(path);
 	}
 
-	/*
-	 * XXX: In the future, we should consider enhancing pg_waldump to read WAL
-	 * files from an archive.
-	 */
-	if (!no_parse_wal && context.format == 't')
-	{
-		pg_log_error("pg_waldump cannot read tar files");
-		pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup.");
-		exit(1);
-	}
-
 	/*
 	 * Perform the appropriate type of verification appropriate based on the
 	 * backup format. This will close 'dir'.
@@ -346,7 +341,7 @@ main(int argc, char **argv)
 		verify_plain_backup_directory(&context, NULL, context.backup_directory,
 									  dir);
 	else
-		verify_tar_backup(&context, dir);
+		verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path);
 
 	/*
 	 * The "matched" flag should now be set on every entry in the hash table.
@@ -364,9 +359,28 @@ main(int argc, char **argv)
 	if (context.format == 'p' && !context.skip_checksums)
 		verify_backup_checksums(&context);
 
-	/* By default, look for the WAL in the backup directory, too. */
+	/*
+	 * By default, WAL files are expected to be found in the backup directory
+	 * for plain-format backups. In the case of tar-format backups, if a
+	 * separate WAL archive is not found, the WAL files are most likely
+	 * included within the main data directory archive.
+	 */
 	if (wal_path == NULL)
-		wal_path = psprintf("%s/pg_wal", context.backup_directory);
+	{
+		if (context.format == 'p')
+			wal_path = psprintf("%s/pg_wal", context.backup_directory);
+		else if (wal_archive_path)
+			wal_path = wal_archive_path;
+		else if (base_archive_path)
+			wal_path = base_archive_path;
+		else
+		{
+			pg_log_error("WAL archive not found");
+			pg_log_error_hint("Specify the correct path using the option -w/--wal-path, "
+							  "or use -n/--no-parse-wal when verifying a tar-format backup.");
+			exit(1);
+		}
+	}
 
 	/*
 	 * Try to parse the required ranges of WAL records, unless we were told
@@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier)
  * close when we're done with it.
  */
 static void
-verify_tar_backup(verifier_context *context, DIR *dir)
+verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path,
+				  char **wal_archive_path)
 {
 	struct dirent *dirent;
 	SimplePtrList tarfiles = {NULL, NULL};
@@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir)
 			char	   *fullpath;
 
 			fullpath = psprintf("%s/%s", context->backup_directory, filename);
-			precheck_tar_backup_file(context, filename, fullpath, &tarfiles);
+			precheck_tar_backup_file(context, filename, fullpath, &tarfiles,
+									 base_archive_path, wal_archive_path);
 			pfree(fullpath);
 		}
 	}
@@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir)
  *
  * The arguments to this function are mostly the same as the
  * verify_plain_backup_file. The additional argument outputs a list of valid
- * tar files.
+ * tar files, along with the full paths to the main archive and the WAL
+ * directory archive.
  */
 static void
 precheck_tar_backup_file(verifier_context *context, char *relpath,
-						 char *fullpath, SimplePtrList *tarfiles)
+						 char *fullpath, SimplePtrList *tarfiles,
+						 char **base_archive_path, char **wal_archive_path)
 {
 	struct stat sb;
 	Oid			tblspc_oid = InvalidOid;
@@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath,
 	 * extension such as .gz, .lz4, or .zst.
 	 */
 	if (strncmp("base", relpath, 4) == 0)
+	{
 		suffix = relpath + 4;
+
+		*base_archive_path = pstrdup(fullpath);
+	}
 	else if (strncmp("pg_wal", relpath, 6) == 0)
+	{
 		suffix = relpath + 6;
+
+		*wal_archive_path = pstrdup(fullpath);
+	}
 	else
 	{
 		/* Expected a <tablespaceoid>.tar file here. */
diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl
index ae16c11bc4d..4f284a9e828 100644
--- a/src/bin/pg_verifybackup/t/002_algorithm.pl
+++ b/src/bin/pg_verifybackup/t/002_algorithm.pl
@@ -30,10 +30,6 @@ sub test_checksums
 	{
 		# Add switch to get a tar-format backup
 		push @backup, ('--format' => 'tar');
-
-		# Add switch to skip WAL verification, which is not yet supported for
-		# tar-format backups
-		push @verify, ('--no-parse-wal');
 	}
 
 	# A backup with a bogus algorithm should fail.
diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl
index 1dd60f709cf..f1ebdbb46b4 100644
--- a/src/bin/pg_verifybackup/t/003_corruption.pl
+++ b/src/bin/pg_verifybackup/t/003_corruption.pl
@@ -193,10 +193,8 @@ for my $scenario (@scenario)
 			command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]);
 			chdir($cwd) || die "chdir: $!";
 
-			# Now check that the backup no longer verifies. We must use -n
-			# here, because pg_waldump can't yet read WAL from a tarfile.
 			command_fails_like(
-				[ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ],
+				[ 'pg_verifybackup', $tar_backup_path ],
 				$scenario->{'fails_like'},
 				"corrupt backup fails verification: $name");
 
diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl
index bc3d6b352ad..0cfe1f9532c 100644
--- a/src/bin/pg_verifybackup/t/008_untar.pl
+++ b/src/bin/pg_verifybackup/t/008_untar.pl
@@ -123,8 +123,7 @@ for my $tc (@test_configuration)
 		# Verify tar backup.
 		$primary->command_ok(
 			[
-				'pg_verifybackup', '--no-parse-wal',
-				'--exit-on-error', $backup_path,
+				'pg_verifybackup', '--exit-on-error', $backup_path,
 			],
 			"verify backup, compression $method");
 
diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl
index b62faeb5acf..76269a73673 100644
--- a/src/bin/pg_verifybackup/t/010_client_untar.pl
+++ b/src/bin/pg_verifybackup/t/010_client_untar.pl
@@ -137,8 +137,7 @@ for my $tc (@test_configuration)
 		# Verify tar backup.
 		$primary->command_ok(
 			[
-				'pg_verifybackup', '--no-parse-wal',
-				'--exit-on-error', $backup_path,
+				'pg_verifybackup', '--exit-on-error', $backup_path,
 			],
 			"verify backup, compression $method");
 
-- 
2.47.1
