On Fri, Sep 12, 2025 at 11:58 PM Robert Haas <robertmh...@gmail.com> wrote:
>
> Here are some review comments on v3-0004:
>

Thanks for the review. My replies are below.

> There doesn't seem to be any reason for
> astreamer_waldump_content_new() to take an astreamer *next argument.
> If you look at astreamer.h, you'll see that some astreamer_BLAH_new()
> functions take such an argument, and others don't. The ones that do
> forward their input to another astreamer; the ones that don't, like
> astreamer_plain_writer_new(), send it somewhere else. AFAICT, this
> astreamer is never going to send its output to another astreamer, so
> there's no reason for this argument.
>

Done.

> I'm also a little confused by the choice of the name
> astreamer_waldump_content_new(). I would have thought this would be
> something like astreamer_waldump_new() or astreamer_xlogreader_new().
> The word "content" doesn't seem to me to be adding much here, and it
> invites confusion with the "content" callback.
>

Done -- renamed to astreamer_waldump_new().

> I think you can merge setup_astreamer() into
> init_tar_archive_reader(). The only other caller is
> verify_tar_archive(), but that does exactly the same additional steps
> as init_tar_archive_reader(), as far as I can see.
>

Done.

> The return statement for astreamer_wal_read is really odd:
>
> +       return (count - nbytes) ? (count - nbytes) : -1;
>

Agreed, that's a bit odd. This seems to be leftover code from the
experimental patch. The astreamer_wal_read() function should behave
like WALRead(): it should either successfully read all the requested
bytes or throw an error. Corrected in the attached version.

>
> I would suggest changing the name of the variable from "readBuff" to
> "readBuf". There are no existing uses of readBuff in the code base.
>

The existing WALDumpReadPage() function has a "readBuff" argument, and
I've used it that way for consistency.

> I think this comment also needs improvement:
>
> +       /*
> +        * Ignore existing data if the required target page has not yet been
> +        * read.
> +        */
> +       if (recptr >= endPtr)
> +       {
> +           len = 0;
> +
> +           /* Reset the buffer */
> +           resetStringInfo(astreamer_buf);
> +       }
>
> This comment is problematic for a few reasons. First, we're not
> ignoring the existing data: we're throwing it out. Second, the comment
> doesn't say why we're doing what we're doing, only that we're doing
> it. Here's my guess at the actual explanation -- please correct me if
> I'm wrong: "pg_waldump never reads the same WAL bytes more than once,
> so if we're now being asked for data beyond the end of what we've
> already read, that means none of the data we currently have in the
> buffer will ever be consulted again. So, we can discard the existing
> buffer contents and start over." By the way, if this explanation is
> correct, it might be nice to add an assertion someplace that verifies
> it, like asserting that we're always reading from an LSN greater than
> or equal to (or exactly equal to?) the LSN immediately following the
> last data we read.
>

Updated the comment. A similar assertion already exists right before
the copy into readBuff.

>
> Another thing that isn't so nice right now is that
> verify_tar_archive() has to open and close the archive only for
> init_tar_archive_reader() to be called to reopen it again just moments
> later. It would be nicer to open the file just once and then keep it
> open. Here again, I wonder if the separation of duties could be a bit
> cleaner.
>

I'd prefer to keep those separate, assuming that reopening the file
won't cause any significant harm. Let me know if you think otherwise.

Attached is the updated version; kindly have a look.

Regards,
Amul
From 8eb84b553d856bbbffda254e419152c236346848 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Tue, 24 Jun 2025 11:33:20 +0530
Subject: [PATCH v4 1/8] Refactor: pg_waldump: Move some declarations to new
 pg_waldump.h

This change prepares for a second source file in this directory to
support reading WAL from tar files. Common structures, declarations,
and functions are being exported through this include file so they can
be used in both files.
---
 src/bin/pg_waldump/pg_waldump.c | 11 ++---------
 src/bin/pg_waldump/pg_waldump.h | 27 +++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 9 deletions(-)
 create mode 100644 src/bin/pg_waldump/pg_waldump.h

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 13d3ec2f5be..a49b2fd96c7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -29,6 +29,7 @@
 #include "common/logging.h"
 #include "common/relpath.h"
 #include "getopt_long.h"
+#include "pg_waldump.h"
 #include "rmgrdesc.h"
 #include "storage/bufpage.h"
 
@@ -39,19 +40,11 @@
 
 static const char *progname;
 
-static int WalSegSz;
+int WalSegSz = DEFAULT_XLOG_SEG_SIZE;
 static volatile sig_atomic_t time_to_stop = false;
 static const RelFileLocator emptyRelFileLocator = {0, 0, 0};
 
-typedef struct XLogDumpPrivate
-{
-    TimeLineID timeline;
-    XLogRecPtr startptr;
-    XLogRecPtr endptr;
-    bool endptr_reached;
-} XLogDumpPrivate;
-
 typedef struct XLogDumpConfig
 {
     /* display options */
diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h
new file mode 100644
index 00000000000..9e62b64ead5
--- /dev/null
+++ b/src/bin/pg_waldump/pg_waldump.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_waldump.h - decode and display WAL
+ *
+ * Copyright (c) 2013-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *    src/bin/pg_waldump/pg_waldump.h
+ *-------------------------------------------------------------------------
+ */
+#ifndef PG_WALDUMP_H
+#define PG_WALDUMP_H
+
+#include "access/xlogdefs.h"
+
+extern int WalSegSz;
+
+/* Contains the necessary information to drive WAL decoding */
+typedef struct XLogDumpPrivate
+{
+    TimeLineID timeline;
+    XLogRecPtr startptr;
+    XLogRecPtr endptr;
+    bool endptr_reached;
+} XLogDumpPrivate;
+
+#endif /* end of PG_WALDUMP_H */
-- 
2.47.1
From 9f719d5744c293a91aca5a933b357296180281ff Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Thu, 26 Jun 2025 11:42:53 +0530
Subject: [PATCH v4 2/8] Refactor: pg_waldump: Separate logic used to calculate
 the required read size.

This refactoring prepares the codebase for an upcoming patch that will
support reading WAL from tar files. The logic for calculating the
required read size has been updated to handle both normal WAL files
and WAL files located inside a tar archive.
---
 src/bin/pg_waldump/pg_waldump.c | 39 ++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a49b2fd96c7..8d0cd9e7156 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,29 @@ identify_target_directory(char *directory, char *fname)
     return NULL; /* not reached */
 }
 
+/* Returns the size in bytes of the data to be read. */
+static inline int
+required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
+                  int reqLen)
+{
+    int count = XLOG_BLCKSZ;
+
+    if (private->endptr != InvalidXLogRecPtr)
+    {
+        if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
+            count = XLOG_BLCKSZ;
+        else if (targetPagePtr + reqLen <= private->endptr)
+            count = private->endptr - targetPagePtr;
+        else
+        {
+            private->endptr_reached = true;
+            return -1;
+        }
+    }
+
+    return count;
+}
+
 /* pg_waldump's XLogReaderRoutine->segment_open callback */
 static void
 WALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo,
@@ -383,21 +406,11 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
                 XLogRecPtr targetPtr, char *readBuff)
 {
     XLogDumpPrivate *private = state->private_data;
-    int count = XLOG_BLCKSZ;
+    int count = required_read_len(private, targetPagePtr, reqLen);
     WALReadError errinfo;
 
-    if (private->endptr != InvalidXLogRecPtr)
-    {
-        if (targetPagePtr + XLOG_BLCKSZ <= private->endptr)
-            count = XLOG_BLCKSZ;
-        else if (targetPagePtr + reqLen <= private->endptr)
-            count = private->endptr - targetPagePtr;
-        else
-        {
-            private->endptr_reached = true;
-            return -1;
-        }
-    }
+    if (private->endptr_reached)
+        return -1;
 
     if (!WALRead(state, readBuff, targetPagePtr, count, private->timeline,
                  &errinfo))
-- 
2.47.1
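The clamping rule that required_read_len() factors out in patch 2/8 can be tried in isolation with the standalone sketch below. It is only an illustration, not part of the patch set: SketchPrivate, sketch_required_read_len, SKETCH_BLCKSZ and SKETCH_INVALID_PTR are hypothetical stand-ins for XLogDumpPrivate, required_read_len(), XLOG_BLCKSZ and InvalidXLogRecPtr, and the block size of 8192 is an assumed default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for XLOG_BLCKSZ and InvalidXLogRecPtr. */
#define SKETCH_BLCKSZ       8192
#define SKETCH_INVALID_PTR  0

/* Hypothetical, trimmed-down stand-in for XLogDumpPrivate. */
typedef struct SketchPrivate
{
    uint64_t    endptr;         /* end LSN, or SKETCH_INVALID_PTR if unset */
    bool        endptr_reached; /* set once a request crosses endptr */
} SketchPrivate;

/*
 * Mirrors the three cases in the patch: a full page fits before endptr,
 * only a partial page up to endptr fits, or even the requested length
 * would cross endptr (in which case -1 is returned and the flag is set).
 */
static int
sketch_required_read_len(SketchPrivate *priv, uint64_t target_page_ptr,
                         int req_len)
{
    assert(req_len >= 0 && req_len <= SKETCH_BLCKSZ);

    /* No end LSN configured: always read a whole page. */
    if (priv->endptr == SKETCH_INVALID_PTR)
        return SKETCH_BLCKSZ;

    if (target_page_ptr + SKETCH_BLCKSZ <= priv->endptr)
        return SKETCH_BLCKSZ;

    if (target_page_ptr + (uint64_t) req_len <= priv->endptr)
        return (int) (priv->endptr - target_page_ptr);

    priv->endptr_reached = true;
    return -1;
}
```

With an end LSN of 20000, a request for the page at 8192 yields a full 8192-byte read, a 100-byte request on the page at 16384 is clamped to 3616 bytes, and a 4000-byte request on that same page returns -1 and sets endptr_reached, which is the condition both page_read callbacks test before reading.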
From f51ddb6d02ef6e3383beebdd4486d10191499955 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 30 Jul 2025 12:43:30 +0530
Subject: [PATCH v4 3/8] Refactor: pg_waldump: Restructure TAP tests.

Restructured some tests to run inside a loop, facilitating their
re-execution for decoding WAL from tar archives.
---
 src/bin/pg_waldump/t/001_basic.pl | 123 ++++++++++++++++--------------
 1 file changed, 67 insertions(+), 56 deletions(-)

diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl
index f26d75e01cf..1b712e8d74d 100644
--- a/src/bin/pg_waldump/t/001_basic.pl
+++ b/src/bin/pg_waldump/t/001_basic.pl
@@ -198,28 +198,6 @@ command_like(
     ],
     qr/./,
     'runs with start and end segment specified');
-command_fails_like(
-    [ 'pg_waldump', '--path' => $node->data_dir ],
-    qr/error: no start WAL location given/,
-    'path option requires start location');
-command_like(
-    [
-        'pg_waldump',
-        '--path' => $node->data_dir,
-        '--start' => $start_lsn,
-        '--end' => $end_lsn,
-    ],
-    qr/./,
-    'runs with path option and start and end locations');
-command_fails_like(
-    [
-        'pg_waldump',
-        '--path' => $node->data_dir,
-        '--start' => $start_lsn,
-    ],
-    qr/error: error in WAL record at/,
-    'falling off the end of the WAL results in an error');
-
 command_like(
     [
         'pg_waldump', '--quiet',
@@ -227,15 +205,6 @@ command_like(
     ],
     qr/^$/,
     'no output with --quiet option');
-command_fails_like(
-    [
-        'pg_waldump', '--quiet',
-        '--path' => $node->data_dir,
-        '--start' => $start_lsn
-    ],
-    qr/error: error in WAL record at/,
-    'errors are shown with --quiet');
-
 # Test for: Display a message that we're skipping data if `from`
 # wasn't a pointer to the start of a record.
 
@@ -272,7 +241,6 @@ sub test_pg_waldump
     my $result = IPC::Run::run [
         'pg_waldump',
-        '--path' => $node->data_dir,
         '--start' => $start_lsn,
         '--end' => $end_lsn,
         @opts
@@ -288,38 +256,81 @@ sub test_pg_waldump
 
 my @lines;
 
-@lines = test_pg_waldump;
-is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
+my @scenario = (
+    {
+        'path' => $node->data_dir
+    });
 
-@lines = test_pg_waldump('--limit' => 6);
-is(@lines, 6, 'limit option observed');
+for my $scenario (@scenario)
+{
+    my $path = $scenario->{'path'};
 
-@lines = test_pg_waldump('--fullpage');
-is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
+    SKIP:
+    {
+        command_fails_like(
+            [ 'pg_waldump', '--path' => $path ],
+            qr/error: no start WAL location given/,
+            'path option requires start location');
+        command_like(
+            [
+                'pg_waldump',
+                '--path' => $path,
+                '--start' => $start_lsn,
+                '--end' => $end_lsn,
+            ],
+            qr/./,
+            'runs with path option and start and end locations');
+        command_fails_like(
+            [
+                'pg_waldump',
+                '--path' => $path,
+                '--start' => $start_lsn,
+            ],
+            qr/error: error in WAL record at/,
+            'falling off the end of the WAL results in an error');
 
-@lines = test_pg_waldump('--stats');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+        command_fails_like(
+            [
+                'pg_waldump', '--quiet',
+                '--path' => $path,
+                '--start' => $start_lsn
+            ],
+            qr/error: error in WAL record at/,
+            'errors are shown with --quiet');
 
-@lines = test_pg_waldump('--stats=record');
-like($lines[0], qr/WAL statistics/, "statistics on stdout");
-is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
+        @lines = test_pg_waldump('--path' => $path);
+        is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines');
 
-@lines = test_pg_waldump('--rmgr' => 'Btree');
-is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+        @lines = test_pg_waldump('--path' => $path, '--limit' => 6);
+        is(@lines, 6, 'limit option observed');
 
-@lines = test_pg_waldump('--fork' => 'init');
-is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+        @lines = test_pg_waldump('--path' => $path, '--fullpage');
+        is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW');
 
-@lines = test_pg_waldump(
-    '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
-is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
-    0, 'only lines for selected relation');
+        @lines = test_pg_waldump('--path' => $path, '--stats');
+        like($lines[0], qr/WAL statistics/, "statistics on stdout");
+        is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
-@lines = test_pg_waldump(
-    '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
-    '--block' => 1);
-is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+        @lines = test_pg_waldump('--path' => $path, '--stats=record');
+        like($lines[0], qr/WAL statistics/, "statistics on stdout");
+        is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output');
 
+        @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree');
+        is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines');
+
+        @lines = test_pg_waldump('--path' => $path, '--fork' => 'init');
+        is(grep(!/fork init/, @lines), 0, 'only init fork lines');
+
+        @lines = test_pg_waldump('--path' => $path,
+            '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid");
+        is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines),
+            0, 'only lines for selected relation');
+
+        @lines = test_pg_waldump('--path' => $path,
+            '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid",
+            '--block' => 1);
+        is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block');
+    }
+}
 
 done_testing();
-- 
2.47.1
From be0cf9b0d4ff99630298e499f93d09a17eeae141 Mon Sep 17 00:00:00 2001
From: Amul Sul <sulamul@gmail.com>
Date: Wed, 16 Jul 2025 18:37:59 +0530
Subject: [PATCH v4 4/8] pg_waldump: Add support for archived WAL decoding.

pg_waldump can now accept the path to a tar archive containing WAL
files and decode them. This feature was added primarily for
pg_verifybackup, which previously disabled WAL parsing for
tar-formatted backups.

Note that this patch requires that the WAL files within the archive be
in sequential order; an error will be reported otherwise. The next
patch is planned to remove this restriction.
---
 doc/src/sgml/ref/pg_waldump.sgml       |   8 +-
 src/bin/pg_waldump/Makefile            |   7 +-
 src/bin/pg_waldump/astreamer_waldump.c | 388 +++++++++++++++++++++++++
 src/bin/pg_waldump/meson.build         |   4 +-
 src/bin/pg_waldump/pg_waldump.c        | 365 +++++++++++++++++++----
 src/bin/pg_waldump/pg_waldump.h        |  20 +-
 src/bin/pg_waldump/t/001_basic.pl      |  84 +++++-
 src/tools/pgindent/typedefs.list       |   1 +
 8 files changed, 799 insertions(+), 78 deletions(-)
 create mode 100644 src/bin/pg_waldump/astreamer_waldump.c

diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index ce23add5577..d004bb0f67e 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -141,13 +141,17 @@ PostgreSQL documentation
       <term><option>--path=<replaceable>path</replaceable></option></term>
       <listitem>
        <para>
-        Specifies a directory to search for WAL segment files or a
-        directory with a <literal>pg_wal</literal> subdirectory that
+        Specifies a tar archive or a directory to search for WAL segment files
+        or a directory with a <literal>pg_wal</literal> subdirectory that
         contains such files. The default is to search in the current
         directory, the <literal>pg_wal</literal> subdirectory of the
         current directory, and the <literal>pg_wal</literal> subdirectory
         of <envar>PGDATA</envar>.
        </para>
+       <para>
+        If a tar archive is provided, its WAL segment files must be in
+        sequential order; otherwise, an error will be reported.
+       </para>
       </listitem>
      </varlistentry>
 
diff --git a/src/bin/pg_waldump/Makefile b/src/bin/pg_waldump/Makefile
index 4c1ee649501..b234613eb50 100644
--- a/src/bin/pg_waldump/Makefile
+++ b/src/bin/pg_waldump/Makefile
@@ -3,6 +3,9 @@
 PGFILEDESC = "pg_waldump - decode and display WAL"
 PGAPPICON=win32
 
+# make these available to TAP test scripts
+export TAR
+
 subdir = src/bin/pg_waldump
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
@@ -12,11 +15,13 @@ OBJS = \
     $(WIN32RES) \
     compat.o \
     pg_waldump.o \
+    astreamer_waldump.o \
     rmgrdesc.o \
     xlogreader.o \
     xlogstats.o
 
-override CPPFLAGS := -DFRONTEND $(CPPFLAGS)
+override CPPFLAGS := -DFRONTEND -I$(libpq_srcdir) $(CPPFLAGS)
+LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils
 
 RMGRDESCSOURCES = $(sort $(notdir $(wildcard $(top_srcdir)/src/backend/access/rmgrdesc/*desc*.c)))
 RMGRDESCOBJS = $(patsubst %.c,%.o,$(RMGRDESCSOURCES))
diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c
new file mode 100644
index 00000000000..caf7da6ccb8
--- /dev/null
+++ b/src/bin/pg_waldump/astreamer_waldump.c
@@ -0,0 +1,388 @@
+/*-------------------------------------------------------------------------
+ *
+ * astreamer_waldump.c
+ *    A generic facility for reading WAL data from tar archives via archive
+ *    streamer.
+ *
+ * Portions Copyright (c) 2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *    src/bin/pg_waldump/astreamer_waldump.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres_fe.h"
+
+#include <unistd.h>
+
+#include "access/xlog_internal.h"
+#include "access/xlogdefs.h"
+#include "common/logging.h"
+#include "fe_utils/simple_list.h"
+#include "pg_waldump.h"
+
+/*
+ * How many bytes should we try to read from a file at once?
+ */
+#define READ_CHUNK_SIZE (128 * 1024)
+
+/*
+ * When nextSegNo is 0, read from any available WAL file.
+ */
+#define READ_ANY_WAL(mystreamer) ((mystreamer)->nextSegNo == 0)
+
+typedef struct astreamer_waldump
+{
+    /* These fields don't change once initialized. */
+    astreamer base;
+    XLogSegNo startSegNo;
+    XLogSegNo endSegNo;
+    XLogDumpPrivate *privateInfo;
+
+    /* These fields change with archive member. */
+    bool skipThisSeg;
+    XLogSegNo nextSegNo; /* Next expected segment to stream */
+} astreamer_waldump;
+
+static int astreamer_archive_read(XLogDumpPrivate *privateInfo);
+static void astreamer_waldump_content(astreamer *streamer,
+                                      astreamer_member *member,
+                                      const char *data, int len,
+                                      astreamer_archive_context context);
+static void astreamer_waldump_finalize(astreamer *streamer);
+static void astreamer_waldump_free(astreamer *streamer);
+
+static bool member_is_relevant_wal(astreamer_waldump *mystreamer,
+                                   astreamer_member *member,
+                                   TimeLineID startTimeLineID,
+                                   XLogSegNo *curSegNo);
+
+static const astreamer_ops astreamer_waldump_ops = {
+    .content = astreamer_waldump_content,
+    .finalize = astreamer_waldump_finalize,
+    .free = astreamer_waldump_free
+};
+
+/*
+ * Copies WAL data from astreamer to readBuff; if unavailable, fetches more
+ * from the tar archive via astreamer.
+ */
+int
+astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count,
+                   XLogDumpPrivate *privateInfo)
+{
+    char *p = readBuff;
+    Size nbytes = count;
+    XLogRecPtr recptr = targetPagePtr;
+    volatile StringInfo astreamer_buf = privateInfo->archive_streamer_buf;
+
+    while (nbytes > 0)
+    {
+        char *buf = astreamer_buf->data;
+        int len = astreamer_buf->len;
+
+        /* WAL record range that the buffer contains */
+        XLogRecPtr endPtr = privateInfo->archive_streamer_read_ptr;
+        XLogRecPtr startPtr = (endPtr > len) ? endPtr - len : 0;
+
+        /*
+         * pg_waldump never ask the same WAL bytes more than once, so if we're
+         * now being asked for data beyond the end of what we've already read,
+         * that means none of the data we currently have in the buffer will
+         * ever be consulted again. So, we can discard the existing buffer
+         * contents and start over.
+         */
+        if (recptr >= endPtr)
+        {
+            len = 0;
+
+            /* Discard the buffered data */
+            resetStringInfo(astreamer_buf);
+        }
+
+        if (len > 0 && recptr > startPtr)
+        {
+            int skipBytes = 0;
+
+            /*
+             * The required offset is not at the start of the archive streamer
+             * buffer, so skip bytes until reaching the desired offset of the
+             * target page.
+             */
+            skipBytes = recptr - startPtr;
+
+            buf += skipBytes;
+            len -= skipBytes;
+        }
+
+        if (len > 0)
+        {
+            int readBytes = len >= nbytes ? nbytes : len;
+
+            /*
+             * Ensure we are reading the correct page, unless we've received
+             * an invalid record pointer. In that specific case, it's
+             * acceptable to read any page.
+             */
+            Assert(XLogRecPtrIsInvalid(recptr) ||
+                   (recptr >= startPtr && recptr < endPtr));
+
+            memcpy(p, buf, readBytes);
+
+            /* Update state for read */
+            nbytes -= readBytes;
+            p += readBytes;
+            recptr += readBytes;
+        }
+        else
+        {
+            /* Fetch more data */
+            if (astreamer_archive_read(privateInfo) == 0)
+            {
+                char fname[MAXFNAMELEN];
+                XLogSegNo segno;
+
+                XLByteToSeg(targetPagePtr, segno, WalSegSz);
+                XLogFileName(fname, privateInfo->timeline, segno, WalSegSz);
+
+                pg_fatal("could not find file \"%s\" in \"%s\" archive",
+                         fname, privateInfo->archive_name);
+            }
+        }
+    }
+
+    /*
+     * Should have either have successfully read all the requested bytes or
+     * reported a failure before this point.
+     */
+    Assert(nbytes == 0);
+
+    return count;
+}
+
+/*
+ * Reads the archive and passes it to the archive streamer for decompression.
+ */
+static int
+astreamer_archive_read(XLogDumpPrivate *privateInfo)
+{
+    int rc;
+    char *buffer;
+
+    buffer = pg_malloc(READ_CHUNK_SIZE * sizeof(uint8));
+
+    /* Read more data from the tar file */
+    rc = read(privateInfo->archive_fd, buffer, READ_CHUNK_SIZE);
+    if (rc < 0)
+        pg_fatal("could not read file \"%s\": %m",
+                 privateInfo->archive_name);
+
+    /*
+     * Decrypt (if required), and then parse the previously read contents of
+     * the tar file.
+     */
+    if (rc > 0)
+        astreamer_content(privateInfo->archive_streamer, NULL,
+                          buffer, rc, ASTREAMER_UNKNOWN);
+    pg_free(buffer);
+
+    return rc;
+}
+
+/*
+ * Create an astreamer that can read WAL from tar file.
+ */
+astreamer *
+astreamer_waldump_new(XLogRecPtr startptr, XLogRecPtr endPtr,
+                      XLogDumpPrivate *privateInfo)
+{
+    astreamer_waldump *streamer;
+
+    streamer = palloc0(sizeof(astreamer_waldump));
+    *((const astreamer_ops **) &streamer->base.bbs_ops) =
+        &astreamer_waldump_ops;
+
+    initStringInfo(&streamer->base.bbs_buffer);
+
+    if (XLogRecPtrIsInvalid(startptr))
+        streamer->startSegNo = 0;
+    else
+    {
+        XLByteToSeg(startptr, streamer->startSegNo, WalSegSz);
+
+        /*
+         * Initialize the record pointer to the beginning of the first
+         * segment; this pointer will track the WAL record reading status.
+         */
+        XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz,
+                                privateInfo->archive_streamer_read_ptr);
+    }
+
+    if (XLogRecPtrIsInvalid(endPtr))
+        streamer->endSegNo = UINT64_MAX;
+    else
+        XLByteToSeg(endPtr, streamer->endSegNo, WalSegSz);
+
+    streamer->nextSegNo = streamer->startSegNo;
+    streamer->privateInfo = privateInfo;
+
+    return &streamer->base;
+}
+
+/*
+ * Main entry point of the archive streamer for reading WAL from a tar file.
+ */
+static void
+astreamer_waldump_content(astreamer *streamer, astreamer_member *member,
+                          const char *data, int len,
+                          astreamer_archive_context context)
+{
+    astreamer_waldump *mystreamer = (astreamer_waldump *) streamer;
+    XLogDumpPrivate *privateInfo = mystreamer->privateInfo;
+
+    Assert(context != ASTREAMER_UNKNOWN);
+
+    switch (context)
+    {
+        case ASTREAMER_MEMBER_HEADER:
+            {
+                XLogSegNo segNo;
+
+                pg_log_debug("pg_waldump: reading \"%s\"", member->pathname);
+
+                mystreamer->skipThisSeg = false;
+
+                if (!member_is_relevant_wal(mystreamer, member,
+                                            privateInfo->timeline, &segNo))
+                {
+                    mystreamer->skipThisSeg = true;
+                    break;
+                }
+
+                /*
+                 * Further checks are skipped if any WAL file can be read.
+                 * This typically occurs during initial verification.
+                 */
+                if (READ_ANY_WAL(mystreamer))
+                    break;
+
+                /* WAL segments must be archived in order */
+                if (mystreamer->nextSegNo != segNo)
+                {
+                    pg_log_error("WAL files are not archived in sequential order");
+                    pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".",
+                                        mystreamer->nextSegNo, segNo);
+                    exit(1);
+                }
+
+                /*
+                 * We track the reading of WAL segment records using a pointer
+                 * that's continuously incremented by the length of the
+                 * received data. This pointer is crucial for serving WAL page
+                 * requests from the WAL decoding routine, so it must be
+                 * accurate.
+                 */
+#ifdef USE_ASSERT_CHECKING
+                if (mystreamer->nextSegNo != 0)
+                {
+                    XLogRecPtr recPtr;
+
+                    XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr);
+                    Assert(privateInfo->archive_streamer_read_ptr == recPtr);
+                }
+#endif
+                /* Update the next expected segment number */
+                mystreamer->nextSegNo += 1;
+            }
+            break;
+
+        case ASTREAMER_MEMBER_CONTENTS:
+            /* Skip this segment */
+            if (mystreamer->skipThisSeg)
+                break;
+
+            /* Or, copy contents to buffer */
+            privateInfo->archive_streamer_read_ptr += len;
+            astreamer_buffer_bytes(streamer, &data, &len, len);
+            break;
+
+        case ASTREAMER_MEMBER_TRAILER:
+            break;
+
+        case ASTREAMER_ARCHIVE_TRAILER:
+            break;
+
+        default:
+            /* Shouldn't happen. */
+            pg_fatal("unexpected state while parsing tar file");
+    }
+}
+
+/*
+ * End-of-stream processing for a astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_finalize(astreamer *streamer)
+{
+    Assert(streamer->bbs_next == NULL);
+}
+
+/*
+ * Free memory associated with a astreamer_waldump stream.
+ */
+static void
+astreamer_waldump_free(astreamer *streamer)
+{
+    Assert(streamer->bbs_next == NULL);
+
+    pfree(streamer->bbs_buffer.data);
+    pfree(streamer);
+}
+
+/*
+ * Returns true if the archive member name matches the WAL naming format and
+ * the corresponding WAL segment falls within the WAL decoding target range;
+ * otherwise, returns false.
+ */
+static bool
+member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member,
+                       TimeLineID startTimeLineID, XLogSegNo *curSegNo)
+{
+    int pathlen;
+    XLogSegNo segNo;
+    TimeLineID timeline;
+    char *fname;
+
+    /* We are only interested in normal files. */
+    if (member->is_directory || member->is_link)
+        return false;
+
+    pathlen = strlen(member->pathname);
+    if (pathlen < XLOG_FNAME_LEN)
+        return false;
+
+    /* WAL file could be with full path */
+    fname = member->pathname + (pathlen - XLOG_FNAME_LEN);
+    if (!IsXLogFileName(fname))
+        return false;
+
+    /* Parse position from file */
+    XLogFromFileName(fname, &timeline, &segNo, WalSegSz);
+
+    /* No further checks are needed if any file ask to read */
+    if (!READ_ANY_WAL(mystreamer))
+    {
+        /* Ignore if the timeline is different */
+        if (startTimeLineID != timeline)
+            return false;
+
+        /* Skip if the current segment is not the desired one */
+        if (mystreamer->startSegNo > segNo || mystreamer->endSegNo < segNo)
+            return false;
+    }
+
+    *curSegNo = segNo;
+
+    return true;
+}
diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 937e0d68841..2a0300dc339 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -3,6 +3,7 @@
 pg_waldump_sources = files(
   'compat.c',
   'pg_waldump.c',
+  'astreamer_waldump.c',
   'rmgrdesc.c',
 )
 
@@ -18,7 +19,7 @@ endif
 
 pg_waldump = executable('pg_waldump',
   pg_waldump_sources,
-  dependencies: [frontend_code, lz4, zstd],
+  dependencies: [frontend_code, lz4, zstd, libpq],
   c_args: ['-DFRONTEND'], # needed for xlogreader et al
   kwargs: default_bin_args,
 )
@@ -29,6 +30,7 @@ tests += {
   'sd': meson.current_source_dir(),
   'bd': meson.current_build_dir(),
   'tap': {
+    'env': {'TAR': tar.found() ? tar.full_path() : ''},
     'tests': [
       't/001_basic.pl',
       't/002_save_fullpage.pl',
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 8d0cd9e7156..393d6bfa9ef 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -326,6 +326,148 @@ identify_target_directory(char *directory, char *fname)
     return NULL; /* not reached */
 }
 
+/*
+ * Returns true if the given file is a tar archive and outputs its compression
+ * algorithm.
+ */
+static bool
+is_tar_file(const char *fname, pg_compress_algorithm *compression)
+{
+    int fname_len = strlen(fname);
+    pg_compress_algorithm compress_algo;
+
+    /* Now, check the compression type of the tar */
+    if (fname_len > 4 &&
+        strcmp(fname + fname_len - 4, ".tar") == 0)
+        compress_algo = PG_COMPRESSION_NONE;
+    else if (fname_len > 4 &&
+             strcmp(fname + fname_len - 4, ".tgz") == 0)
+        compress_algo = PG_COMPRESSION_GZIP;
+    else if (fname_len > 7 &&
+             strcmp(fname + fname_len - 7, ".tar.gz") == 0)
+        compress_algo = PG_COMPRESSION_GZIP;
+    else if (fname_len > 8 &&
+             strcmp(fname + fname_len - 8, ".tar.lz4") == 0)
+        compress_algo = PG_COMPRESSION_LZ4;
+    else if (fname_len > 8 &&
+             strcmp(fname + fname_len - 8, ".tar.zst") == 0)
+        compress_algo = PG_COMPRESSION_ZSTD;
+    else
+        return false;
+
+    *compression = compress_algo;
+
+    return true;
+}
+
+/*
+ * Initializes the tar archive reader and a temporary directory for WAL files.
+ */
+static void
+init_tar_archive_reader(XLogDumpPrivate *private, const char *waldir,
+                        XLogRecPtr startptr, XLogRecPtr endptr,
+                        pg_compress_algorithm compression)
+{
+    int fd;
+    astreamer *streamer;
+
+    /* Open tar archive and store its file descriptor */
+    fd = open_file_in_directory(waldir, private->archive_name);
+
+    if (fd < 0)
+        pg_fatal("could not open file \"%s\"", private->archive_name);
+
+    private->archive_fd = fd;
+
+    /*
+     * Create an appropriate chain of archive streamers for reading the given
+     * tar archive.
+     */
+    streamer = astreamer_waldump_new(startptr, endptr, private);
+
+    /*
+     * Final extracted WAL data will reside in this streamer. However, since
+     * it sits at the bottom of the stack and isn't designed to propagate data
+     * upward, we need to hold a pointer to its data buffer in order to copy.
+     */
+    private->archive_streamer_buf = &streamer->bbs_buffer;
+
+    /* Before that we must parse the tar archive. */
+    streamer = astreamer_tar_parser_new(streamer);
+
+    /* Before that we must decompress, if archive is compressed. */
+    if (compression == PG_COMPRESSION_GZIP)
+        streamer = astreamer_gzip_decompressor_new(streamer);
+    else if (compression == PG_COMPRESSION_LZ4)
+        streamer = astreamer_lz4_decompressor_new(streamer);
+    else if (compression == PG_COMPRESSION_ZSTD)
+        streamer = astreamer_zstd_decompressor_new(streamer);
+
+    private->archive_streamer = streamer;
+}
+
+/*
+ * Release the archive streamer chain and close the archive file.
+ */
+static void
+free_tar_archive_reader(XLogDumpPrivate *private)
+{
+    /*
+     * NB: Normally, astreamer_finalize() is called before astreamer_free() to
+     * flush any remaining buffered data or to ensure the end of the tar
+     * archive is reached. However, when decoding a WAL file, once we hit the
+     * end LSN, any remaining WAL data in the buffer or the tar archive's
+     * unreached end can be safely ignored.
+     */
+    astreamer_free(private->archive_streamer);
+
+    /* Close the file. */
+    if (close(private->archive_fd) != 0)
+        pg_log_error("could not close file \"%s\": %m",
+                     private->archive_name);
+}
+
+/*
+ * Reads a WAL page from the archive and verifies WAL segment size.
+ */
+static void
+verify_tar_archive(XLogDumpPrivate *private, const char *waldir,
+                   pg_compress_algorithm compression)
+{
+    PGAlignedXLogBlock buf;
+    int r;
+
+    /* Initialize the reader to stream WAL data from a tar file */
+    init_tar_archive_reader(private, waldir, InvalidXLogRecPtr,
+                            InvalidXLogRecPtr, compression);
+
+    /* Read a wal page */
+    r = astreamer_wal_read(buf.data, InvalidXLogRecPtr, XLOG_BLCKSZ, private);
+
+    /* Set WalSegSz if WAL data is successfully read */
+    if (r == XLOG_BLCKSZ)
+    {
+        XLogLongPageHeader longhdr = (XLogLongPageHeader) buf.data;
+
+        WalSegSz = longhdr->xlp_seg_size;
+
+        if (!IsValidWalSegSize(WalSegSz))
+        {
+            pg_log_error(ngettext("invalid WAL segment size in WAL file \"%s\" (%d byte)",
+                                  "invalid WAL segment size in WAL file \"%s\" (%d bytes)",
+                                  WalSegSz),
+                         private->archive_name, WalSegSz);
+            pg_log_error_detail("The WAL segment size must be a power of two between 1 MB and 1 GB.");
+            exit(1);
+        }
+    }
+    else
+        pg_fatal("could not read WAL data from \"%s\" archive: read %d of %d",
+                 private->archive_name, r, XLOG_BLCKSZ);
+
+    free_tar_archive_reader(private);
+}
+
 /* Returns the size in bytes of the data to be read. */
 static inline int
 required_read_len(XLogDumpPrivate *private, XLogRecPtr targetPagePtr,
@@ -406,7 +548,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
                 XLogRecPtr targetPtr, char *readBuff)
 {
     XLogDumpPrivate *private = state->private_data;
-    int count = required_read_len(private, targetPagePtr, reqLen);
+    int count = required_read_len(private, targetPtr, reqLen);
     WALReadError errinfo;
 
     if (private->endptr_reached)
@@ -436,6 +578,44 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen,
     return count;
 }
 
+/*
+ * pg_waldump's XLogReaderRoutine->segment_open callback to support dumping WAL
+ * files from tar archives.
+ */ +static void +TarWALDumpOpenSegment(XLogReaderState *state, XLogSegNo nextSegNo, + TimeLineID *tli_p) +{ + /* No action needed */ +} + +/* + * pg_waldump's XLogReaderRoutine->segment_close callback. + */ +static void +TarWALDumpCloseSegment(XLogReaderState *state) +{ + /* No action needed */ +} + +/* + * pg_waldump's XLogReaderRoutine->page_read callback to support dumping WAL + * files from tar archives. + */ +static int +TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen, + XLogRecPtr targetPtr, char *readBuff) +{ + XLogDumpPrivate *private = state->private_data; + int count = required_read_len(private, targetPtr, reqLen); + + if (private->endptr_reached) + return -1; + + /* Read the WAL page from the archive streamer */ + return astreamer_wal_read(readBuff, targetPagePtr, count, private); +} + /* * Boolean to return whether the given WAL record matches a specific relation * and optionally block. @@ -773,8 +953,8 @@ usage(void) printf(_(" -F, --fork=FORK only show records that modify blocks in fork FORK;\n" " valid names are main, fsm, vm, init\n")); printf(_(" -n, --limit=N number of records to display\n")); - printf(_(" -p, --path=PATH directory in which to find WAL segment files or a\n" - " directory with a ./pg_wal that contains such files\n" + printf(_(" -p, --path=PATH tar archive or a directory in which to find WAL segment files or\n" + " a directory with a ./pg_wal that contains such files\n" " (default: current directory, ./pg_wal, $PGDATA/pg_wal)\n")); printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -r, --rmgr=RMGR only show records generated by resource manager RMGR;\n" @@ -806,7 +986,10 @@ main(int argc, char **argv) XLogRecord *record; XLogRecPtr first_record; char *waldir = NULL; + char *walpath = NULL; char *errormsg; + bool is_tar = false; + pg_compress_algorithm compression; static struct option long_options[] = { {"bkp-details", no_argument, NULL, 'b'}, @@ -938,7 +1121,7 @@ 
main(int argc, char **argv) } break; case 'p': - waldir = pg_strdup(optarg); + walpath = pg_strdup(optarg); break; case 'q': config.quiet = true; @@ -1102,10 +1285,20 @@ main(int argc, char **argv) goto bad_argument; } - if (waldir != NULL) + if (walpath != NULL) { + /* validate path points to tar archive */ + if (is_tar_file(walpath, &compression)) + { + char *fname = NULL; + + split_path(walpath, &waldir, &fname); + + private.archive_name = fname; + is_tar = true; + } /* validate path points to directory */ - if (!verify_directory(waldir)) + else if (!verify_directory(walpath)) { pg_log_error("could not open directory \"%s\": %m", waldir); goto bad_argument; @@ -1123,46 +1316,36 @@ main(int argc, char **argv) int fd; XLogSegNo segno; + /* + * If a tar archive is passed using the --path option, all other + * arguments become unnecessary. + */ + if (is_tar) + { + pg_log_error("unnecessary command-line arguments specified with tar archive (first is \"%s\")", + argv[optind]); + goto bad_argument; + } + split_path(argv[optind], &directory, &fname); - if (waldir == NULL && directory != NULL) + if (walpath == NULL && directory != NULL) { - waldir = directory; + walpath = directory; - if (!verify_directory(waldir)) + if (!verify_directory(walpath)) pg_fatal("could not open directory \"%s\": %m", waldir); } - waldir = identify_target_directory(waldir, fname); - fd = open_file_in_directory(waldir, fname); - if (fd < 0) - pg_fatal("could not open file \"%s\"", fname); - close(fd); - - /* parse position from file */ - XLogFromFileName(fname, &private.timeline, &segno, WalSegSz); - - if (XLogRecPtrIsInvalid(private.startptr)) - XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); - else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) + if (fname != NULL && is_tar_file(fname, &compression)) { - pg_log_error("start WAL location %X/%08X is not inside file \"%s\"", - LSN_FORMAT_ARGS(private.startptr), - fname); - goto bad_argument; + private.archive_name = fname; + 
waldir = walpath ? pg_strdup(walpath) : pg_strdup("."); + is_tar = true; } - - /* no second file specified, set end position */ - if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr)) - XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr); - - /* parse ENDSEG if passed */ - if (optind + 1 < argc) + else { - XLogSegNo endsegno; - - /* ignore directory, already have that */ - split_path(argv[optind + 1], &directory, &fname); + waldir = identify_target_directory(walpath, fname); fd = open_file_in_directory(waldir, fname); if (fd < 0) @@ -1170,32 +1353,70 @@ main(int argc, char **argv) close(fd); /* parse position from file */ - XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz); + XLogFromFileName(fname, &private.timeline, &segno, WalSegSz); - if (endsegno < segno) - pg_fatal("ENDSEG %s is before STARTSEG %s", - argv[optind + 1], argv[optind]); + if (XLogRecPtrIsInvalid(private.startptr)) + XLogSegNoOffsetToRecPtr(segno, 0, WalSegSz, private.startptr); + else if (!XLByteInSeg(private.startptr, segno, WalSegSz)) + { + pg_log_error("start WAL location %X/%08X is not inside file \"%s\"", + LSN_FORMAT_ARGS(private.startptr), + fname); + goto bad_argument; + } - if (XLogRecPtrIsInvalid(private.endptr)) - XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz, - private.endptr); + /* no second file specified, set end position */ + if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr)) + XLogSegNoOffsetToRecPtr(segno + 1, 0, WalSegSz, private.endptr); - /* set segno to endsegno for check of --end */ - segno = endsegno; - } + /* parse ENDSEG if passed */ + if (optind + 1 < argc) + { + XLogSegNo endsegno; + + /* ignore directory, already have that */ + split_path(argv[optind + 1], &directory, &fname); + + fd = open_file_in_directory(waldir, fname); + if (fd < 0) + pg_fatal("could not open file \"%s\"", fname); + close(fd); + + /* parse position from file */ + XLogFromFileName(fname, &private.timeline, &endsegno, WalSegSz); + + if 
(endsegno < segno) + pg_fatal("ENDSEG %s is before STARTSEG %s", + argv[optind + 1], argv[optind]); + if (XLogRecPtrIsInvalid(private.endptr)) + XLogSegNoOffsetToRecPtr(endsegno + 1, 0, WalSegSz, + private.endptr); - if (!XLByteInSeg(private.endptr, segno, WalSegSz) && - private.endptr != (segno + 1) * WalSegSz) - { - pg_log_error("end WAL location %X/%08X is not inside file \"%s\"", - LSN_FORMAT_ARGS(private.endptr), - argv[argc - 1]); - goto bad_argument; + /* set segno to endsegno for check of --end */ + segno = endsegno; + } + + + if (!XLByteInSeg(private.endptr, segno, WalSegSz) && + private.endptr != (segno + 1) * WalSegSz) + { + pg_log_error("end WAL location %X/%08X is not inside file \"%s\"", + LSN_FORMAT_ARGS(private.endptr), + argv[argc - 1]); + goto bad_argument; + } } } - else - waldir = identify_target_directory(waldir, NULL); + else if (!is_tar) + waldir = identify_target_directory(walpath, NULL); + + /* Verify that the archive contains valid WAL files */ + if (is_tar) + { + waldir = waldir ? 
pg_strdup(waldir) : pg_strdup("."); + verify_tar_archive(&private, waldir, compression); + } /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) @@ -1207,12 +1428,31 @@ main(int argc, char **argv) /* done with argument parsing, do the actual work */ /* we have everything we need, start reading */ - xlogreader_state = - XLogReaderAllocate(WalSegSz, waldir, - XL_ROUTINE(.page_read = WALDumpReadPage, - .segment_open = WALDumpOpenSegment, - .segment_close = WALDumpCloseSegment), - &private); + if (is_tar) + { + /* Set up for reading tar file */ + init_tar_archive_reader(&private, waldir, private.startptr, + private.endptr, compression); + + /* Routine to decode WAL files in tar archive */ + xlogreader_state = + XLogReaderAllocate(WalSegSz, waldir, + XL_ROUTINE(.page_read = TarWALDumpReadPage, + .segment_open = TarWALDumpOpenSegment, + .segment_close = TarWALDumpCloseSegment), + &private); + } + else + { + /* Routine to decode WAL files */ + xlogreader_state = + XLogReaderAllocate(WalSegSz, waldir, + XL_ROUTINE(.page_read = WALDumpReadPage, + .segment_open = WALDumpOpenSegment, + .segment_close = WALDumpCloseSegment), + &private); + } + if (!xlogreader_state) pg_fatal("out of memory while allocating a WAL reading processor"); @@ -1321,6 +1561,9 @@ main(int argc, char **argv) XLogReaderFree(xlogreader_state); + if (is_tar) + free_tar_archive_reader(&private); + return EXIT_SUCCESS; bad_argument: diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h index 9e62b64ead5..4205e0ef597 100644 --- a/src/bin/pg_waldump/pg_waldump.h +++ b/src/bin/pg_waldump/pg_waldump.h @@ -12,6 +12,8 @@ #define PG_WALDUMP_H #include "access/xlogdefs.h" +#include "fe_utils/astreamer.h" +#include "lib/stringinfo.h" extern int WalSegSz; @@ -22,6 +24,22 @@ typedef struct XLogDumpPrivate XLogRecPtr startptr; XLogRecPtr endptr; bool endptr_reached; + + /* Fields required to read WAL from archive */ + char *archive_name; /* Tar archive name */ + int 
archive_fd; /* File descriptor for the open tar file */ + + astreamer *archive_streamer; + StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */ + XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records + until this record pointer */ } XLogDumpPrivate; -#endif /* end of PG_WALDUMP_H */ + +extern astreamer *astreamer_waldump_new(XLogRecPtr startptr, + XLogRecPtr endptr, + XLogDumpPrivate *privateInfo); +extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count, + XLogDumpPrivate *privateInfo); + +#endif /* end of PG_WALDUMP_H */ diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl index 1b712e8d74d..443126a9ce6 100644 --- a/src/bin/pg_waldump/t/001_basic.pl +++ b/src/bin/pg_waldump/t/001_basic.pl @@ -3,10 +3,13 @@ use strict; use warnings FATAL => 'all'; +use Cwd; use PostgreSQL::Test::Cluster; use PostgreSQL::Test::Utils; use Test::More; +my $tar = $ENV{TAR}; + program_help_ok('pg_waldump'); program_version_ok('pg_waldump'); program_options_handling_ok('pg_waldump'); @@ -235,7 +238,7 @@ command_like( sub test_pg_waldump { local $Test::Builder::Level = $Test::Builder::Level + 1; - my @opts = @_; + my ($path, @opts) = @_; my ($stdout, $stderr); @@ -243,6 +246,7 @@ sub test_pg_waldump 'pg_waldump', '--start' => $start_lsn, '--end' => $end_lsn, + '--path' => $path, @opts ], '>' => \$stdout, @@ -254,11 +258,50 @@ sub test_pg_waldump return @lines; } -my @lines; +# Create a tar archive, sorting the file order +sub generate_archive +{ + my ($archive, $directory, $compression_flags) = @_; + + my @files; + opendir my $dh, $directory or die "opendir: $!"; + while (my $entry = readdir $dh) { + # Skip '.' and '..' + next if $entry eq '.' 
|| $entry eq '..'; + push @files, $entry; + } + closedir $dh; + + @files = sort @files; + + # move into the WAL directory before archiving files + my $cwd = getcwd; + chdir($directory) || die "chdir: $!"; + command_ok([$tar, $compression_flags, $archive, @files]); + chdir($cwd) || die "chdir: $!"; +} + +my $tmp_dir = PostgreSQL::Test::Utils::tempdir_short(); my @scenario = ( { - 'path' => $node->data_dir + 'path' => $node->data_dir, + 'is_archive' => 0, + 'enabled' => 1 + }, + { + 'path' => "$tmp_dir/pg_wal.tar", + 'compression_method' => 'none', + 'compression_flags' => '-cf', + 'is_archive' => 1, + 'enabled' => 1 + }, + { + 'path' => "$tmp_dir/pg_wal.tar.gz", + 'compression_method' => 'gzip', + 'compression_flags' => '-czf', + 'is_archive' => 1, + 'enabled' => check_pg_config("#define HAVE_LIBZ 1") }); for my $scenario (@scenario) @@ -267,6 +310,19 @@ for my $scenario (@scenario) SKIP: { + skip "tar command is not available", 3 + if !defined $tar; + skip "$scenario->{'compression_method'} compression not supported by this build", 3 + if !$scenario->{'enabled'} && $scenario->{'is_archive'}; + + # create pg_wal archive + if ($scenario->{'is_archive'}) + { + generate_archive($path, + $node->data_dir . 
'/pg_wal', + $scenario->{'compression_flags'}); + } + command_fails_like( [ 'pg_waldump', '--path' => $path ], qr/error: no start WAL location given/, @@ -298,38 +354,42 @@ for my $scenario (@scenario) qr/error: error in WAL record at/, 'errors are shown with --quiet'); - @lines = test_pg_waldump('--path' => $path); + my @lines; + @lines = test_pg_waldump($path); is(grep(!/^rmgr: \w/, @lines), 0, 'all output lines are rmgr lines'); - @lines = test_pg_waldump('--path' => $path, '--limit' => 6); + @lines = test_pg_waldump($path, '--limit' => 6); is(@lines, 6, 'limit option observed'); - @lines = test_pg_waldump('--path' => $path, '--fullpage'); + @lines = test_pg_waldump($path, '--fullpage'); is(grep(!/^rmgr:.*\bFPW\b/, @lines), 0, 'all output lines are FPW'); - @lines = test_pg_waldump('--path' => $path, '--stats'); + @lines = test_pg_waldump($path, '--stats'); like($lines[0], qr/WAL statistics/, "statistics on stdout"); is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output'); - @lines = test_pg_waldump('--path' => $path, '--stats=record'); + @lines = test_pg_waldump($path, '--stats=record'); like($lines[0], qr/WAL statistics/, "statistics on stdout"); is(grep(/^rmgr:/, @lines), 0, 'no rmgr lines output'); - @lines = test_pg_waldump('--path' => $path, '--rmgr' => 'Btree'); + @lines = test_pg_waldump($path, '--rmgr' => 'Btree'); is(grep(!/^rmgr: Btree/, @lines), 0, 'only Btree lines'); - @lines = test_pg_waldump('--path' => $path, '--fork' => 'init'); + @lines = test_pg_waldump($path, '--fork' => 'init'); is(grep(!/fork init/, @lines), 0, 'only init fork lines'); - @lines = test_pg_waldump('--path' => $path, + @lines = test_pg_waldump($path, '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_t1_oid"); is(grep(!/rel $default_ts_oid\/$postgres_db_oid\/$rel_t1_oid/, @lines), 0, 'only lines for selected relation'); - @lines = test_pg_waldump('--path' => $path, + @lines = test_pg_waldump($path, '--relation' => "$default_ts_oid/$postgres_db_oid/$rel_i1a_oid", '--block' 
=> 1); is(grep(!/\bblk 1\b/, @lines), 0, 'only lines for selected block'); + + # Cleanup. + unlink $path if $scenario->{'is_archive'}; } } diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index a13e8162890..b406ca041ec 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -3444,6 +3444,7 @@ astreamer_recovery_injector astreamer_tar_archiver astreamer_tar_parser astreamer_verify +astreamer_waldump astreamer_zstd_frame auth_password_hook_typ autovac_table -- 2.47.1
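As an aside for readers of patch 4/8: verify_tar_archive() reads one page, takes xlp_seg_size from the long page header, and rejects the value unless it is a power of two between 1 MB and 1 GB. Below is a minimal standalone sketch of that rule. The sketch_* names and constants are hypothetical stand-ins and not the PostgreSQL macros; the second helper only illustrates why the size has to be validated before any per-segment arithmetic is done with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the limits named in the error detail. */
#define SKETCH_WAL_SEG_MIN_SIZE (UINT64_C(1024) * 1024)			/* 1 MB */
#define SKETCH_WAL_SEG_MAX_SIZE (UINT64_C(1024) * 1024 * 1024)	/* 1 GB */

/* True if size is a power of two between 1 MB and 1 GB, inclusive. */
static bool
sketch_is_valid_wal_seg_size(uint64_t size)
{
	/* A power of two has exactly one bit set; zero is rejected explicitly. */
	bool		is_power_of_two = (size != 0) && ((size & (size - 1)) == 0);

	return is_power_of_two &&
		size >= SKETCH_WAL_SEG_MIN_SIZE &&
		size <= SKETCH_WAL_SEG_MAX_SIZE;
}

/*
 * Number of segments per 4 GB of WAL address space.  An unvalidated size
 * of zero would divide by zero here, hence the assertion.
 */
static uint64_t
sketch_segments_per_xlogid(uint64_t segsz)
{
	assert(sketch_is_valid_wal_seg_size(segsz));
	return UINT64_C(0x100000000) / segsz;
}
```

With the default 16 MB segment size the check passes and there are 256 segments per 4 GB; sizes such as 3 MB (not a power of two) or 2 GB (above the limit) are rejected.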
From d79a505af9532baae675557de0efb1bcc35d72f5 Mon Sep 17 00:00:00 2001 From: Amul Sul <sulamul@gmail.com> Date: Mon, 25 Aug 2025 17:26:29 +0530 Subject: [PATCH v4 5/8] pg_waldump: Remove the restriction on the order of archived WAL files. With the previous patch, pg_waldump would stop decoding if WAL files were not in the required sequence. With this patch, decoding will now continue. Any WAL file that is out of order will be written to a temporary location, from which it will be read later. Once a temporary file has been read, it will be removed. --- doc/src/sgml/ref/pg_waldump.sgml | 7 +- src/bin/pg_waldump/astreamer_waldump.c | 214 +++++++++++++++++++++---- src/bin/pg_waldump/pg_waldump.c | 112 ++++++++++++- src/bin/pg_waldump/pg_waldump.h | 30 +++- src/bin/pg_waldump/t/001_basic.pl | 3 +- 5 files changed, 323 insertions(+), 43 deletions(-) diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml index d004bb0f67e..c1afb4097b5 100644 --- a/doc/src/sgml/ref/pg_waldump.sgml +++ b/doc/src/sgml/ref/pg_waldump.sgml @@ -149,8 +149,11 @@ PostgreSQL documentation of <envar>PGDATA</envar>. </para> <para> - If a tar archive is provided, its WAL segment files must be in - sequential order; otherwise, an error will be reported. + If a tar archive is provided and its WAL segment files are not in + sequential order, those files will be written temporarily. These files + will be created inside the directory specified by the <envar>TMPDIR</envar> + environment variable if it is set; otherwise, the temporary files will + be created within the same directory as the tar archive itself. 
</para> </listitem> </varlistentry> diff --git a/src/bin/pg_waldump/astreamer_waldump.c b/src/bin/pg_waldump/astreamer_waldump.c index caf7da6ccb8..40876c77f6c 100644 --- a/src/bin/pg_waldump/astreamer_waldump.c +++ b/src/bin/pg_waldump/astreamer_waldump.c @@ -18,8 +18,8 @@ #include "access/xlog_internal.h" #include "access/xlogdefs.h" +#include "common/file_perm.h" #include "common/logging.h" -#include "fe_utils/simple_list.h" #include "pg_waldump.h" /* @@ -42,10 +42,11 @@ typedef struct astreamer_waldump /* These fields change with archive member. */ bool skipThisSeg; + bool writeThisSeg; + FILE *segFp; XLogSegNo nextSegNo; /* Next expected segment to stream */ } astreamer_waldump; -static int astreamer_archive_read(XLogDumpPrivate *privateInfo); static void astreamer_waldump_content(astreamer *streamer, astreamer_member *member, const char *data, int len, @@ -56,7 +57,12 @@ static void astreamer_waldump_free(astreamer *streamer); static bool member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member, TimeLineID startTimeLineID, + char **curFname, XLogSegNo *curSegNo); +static FILE *member_prepare_tmp_write(XLogSegNo curSegNo, + const char *fname); +static XLogSegNo member_next_segno(XLogSegNo curSegNo, + TimeLineID timeline); static const astreamer_ops astreamer_waldump_ops = { .content = astreamer_waldump_content, @@ -164,7 +170,7 @@ astreamer_wal_read(char *readBuff, XLogRecPtr targetPagePtr, Size count, /* * Reads the archive and passes it to the archive streamer for decompression. */ -static int +int astreamer_archive_read(XLogDumpPrivate *privateInfo) { int rc; @@ -208,17 +214,8 @@ astreamer_waldump_new(XLogRecPtr startptr, XLogRecPtr endPtr, if (XLogRecPtrIsInvalid(startptr)) streamer->startSegNo = 0; else - { XLByteToSeg(startptr, streamer->startSegNo, WalSegSz); - /* - * Initialize the record pointer to the beginning of the first - * segment; this pointer will track the WAL record reading status. 
- */ - XLogSegNoOffsetToRecPtr(streamer->startSegNo, 0, WalSegSz, - privateInfo->archive_streamer_read_ptr); - } - if (XLogRecPtrIsInvalid(endPtr)) streamer->endSegNo = UINT64_MAX; else @@ -247,14 +244,16 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member, { case ASTREAMER_MEMBER_HEADER: { - XLogSegNo segNo; + char *fname; pg_log_debug("pg_waldump: reading \"%s\"", member->pathname); mystreamer->skipThisSeg = false; + mystreamer->writeThisSeg = false; if (!member_is_relevant_wal(mystreamer, member, - privateInfo->timeline, &segNo)) + privateInfo->timeline, + &fname, &privateInfo->curSegNo)) { mystreamer->skipThisSeg = true; break; @@ -267,33 +266,67 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member, if (READ_ANY_WAL(mystreamer)) break; - /* WAL segments must be archived in order */ - if (mystreamer->nextSegNo != segNo) + /* + * When WAL segments are not archived sequentially, it becomes + * necessary to write out (or preserve) segments that might be + * required at a later point. + */ + if (mystreamer->nextSegNo != privateInfo->curSegNo) { - pg_log_error("WAL files are not archived in sequential order"); - pg_log_error_detail("Expecting segment number " UINT64_FORMAT " but found " UINT64_FORMAT ".", - mystreamer->nextSegNo, segNo); - exit(1); + mystreamer->writeThisSeg = true; + mystreamer->segFp = + member_prepare_tmp_write(privateInfo->curSegNo, fname); + break; } /* - * We track the reading of WAL segment records using a pointer - * that's continuously incremented by the length of the - * received data. This pointer is crucial for serving WAL page - * requests from the WAL decoding routine, so it must be - * accurate. + * If the buffer contains data, the next WAL record must + * logically follow it. Otherwise, this file isn't the one we + * need, and we must export it. 
*/ -#ifdef USE_ASSERT_CHECKING - if (mystreamer->nextSegNo != 0) + else if (privateInfo->archive_streamer_buf->len != 0) { XLogRecPtr recPtr; - XLogSegNoOffsetToRecPtr(segNo, 0, WalSegSz, recPtr); - Assert(privateInfo->archive_streamer_read_ptr == recPtr); + XLogSegNoOffsetToRecPtr(privateInfo->curSegNo, 0, WalSegSz, + recPtr); + + if (privateInfo->archive_streamer_read_ptr != recPtr) + { + mystreamer->writeThisSeg = true; + mystreamer->segFp = + member_prepare_tmp_write(privateInfo->curSegNo, fname); + + /* Update the next expected segment number after this */ + mystreamer->nextSegNo = + member_next_segno(privateInfo->curSegNo + 1, + privateInfo->timeline); + break; + } } -#endif + + Assert(!mystreamer->skipThisSeg); + Assert(!mystreamer->writeThisSeg); + + /* + * We are now streaming segment content. + * + * We need to track the reading of WAL segment records using a + * pointer that's typically incremented by the length of the + * data read. However, we sometimes export the WAL file to + * temporary storage, allowing the decoding routine to read + * directly from there. This makes continuous pointer + * incrementing challenging, as file reads can occur from any + * offset, leading to potential errors. Therefore, we now + * reset the pointer when reading from a file for streaming. 
+ */ + XLogSegNoOffsetToRecPtr(privateInfo->curSegNo, 0, WalSegSz, + privateInfo->archive_streamer_read_ptr); + /* Update the next expected segment number */ - mystreamer->nextSegNo += 1; + mystreamer->nextSegNo = + member_next_segno(privateInfo->curSegNo, + privateInfo->timeline); } break; @@ -302,12 +335,45 @@ astreamer_waldump_content(astreamer *streamer, astreamer_member *member, if (mystreamer->skipThisSeg) break; + /* Or, write contents to file */ + if (mystreamer->writeThisSeg) + { + Assert(mystreamer->segFp != NULL); + + errno = 0; + if (len > 0 && fwrite(data, len, 1, mystreamer->segFp) != 1) + { + char *fname; + int pathlen = strlen(member->pathname); + + Assert(pathlen >= XLOG_FNAME_LEN); + + fname = member->pathname + (pathlen - XLOG_FNAME_LEN); + + /* + * If write didn't set errno, assume problem is no disk + * space + */ + if (errno == 0) + errno = ENOSPC; + pg_fatal("could not write to file \"%s\": %m", + get_tmp_wal_file_path(fname)); + } + break; + } + /* Or, copy contents to buffer */ privateInfo->archive_streamer_read_ptr += len; astreamer_buffer_bytes(streamer, &data, &len, len); break; case ASTREAMER_MEMBER_TRAILER: + if (mystreamer->segFp != NULL) + { + fclose(mystreamer->segFp); + mystreamer->segFp = NULL; + } + privateInfo->curSegNo = 0; break; case ASTREAMER_ARCHIVE_TRAILER: @@ -334,8 +400,14 @@ astreamer_waldump_finalize(astreamer *streamer) static void astreamer_waldump_free(astreamer *streamer) { + astreamer_waldump *mystreamer; + Assert(streamer->bbs_next == NULL); + mystreamer = (astreamer_waldump *) streamer; + if (mystreamer->segFp != NULL) + fclose(mystreamer->segFp); + pfree(streamer->bbs_buffer.data); pfree(streamer); } @@ -347,7 +419,8 @@ astreamer_waldump_free(astreamer *streamer) */ static bool member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member, - TimeLineID startTimeLineID, XLogSegNo *curSegNo) + TimeLineID startTimeLineID, char **curFname, + XLogSegNo *curSegNo) { int pathlen; XLogSegNo segNo; @@ 
-382,7 +455,84 @@ member_is_relevant_wal(astreamer_waldump *mystreamer, astreamer_member *member, return false; } + *curFname = fname; *curSegNo = segNo; return true; } + +/* + * Create an empty placeholder file and return its handle. The file is also + * added to an exported list for future management, e.g. access, deletion, and + * existence checks. + */ +static FILE * +member_prepare_tmp_write(XLogSegNo curSegNo, const char *fname) +{ + FILE *file; + char *fpath = get_tmp_wal_file_path(fname); + + /* Create an empty placeholder */ + file = fopen(fpath, PG_BINARY_W); + if (file == NULL) + pg_fatal("could not create file \"%s\": %m", fpath); + +#ifndef WIN32 + if (chmod(fpath, pg_file_create_mode)) + pg_fatal("could not set permissions on file \"%s\": %m", + fpath); +#endif + + pg_log_info("temporarily exporting file \"%s\"", fpath); + + /* Record this segment's export */ + simple_string_list_append(&TmpWalSegList, fname); + pfree(fpath); + + return file; +} + +/* + * Get next WAL segment that needs to be retrieved from the archive. + * + * The function checks for the presence of a previously read and extracted WAL + * segment in the temporary storage. If a temporary file is found for that + * segment, it indicates the segment has already been successfully retrieved + * from the archive. In this case, the function increments the segment number + * and repeats the check. This process continues until a segment that has not + * yet been retrieved is found, at which point the function returns its number. + */ +static XLogSegNo +member_next_segno(XLogSegNo curSegNo, TimeLineID timeline) +{ + XLogSegNo nextSegNo = curSegNo + 1; + bool exists; + + /* + * If we find a file that was previously written to the temporary space, + * it indicates that the corresponding WAL segment request has already + * been fulfilled. In that case, we increment the nextSegNo counter and + * check that segment number again. 
If one is found, the above steps + * are repeated; if not, we return that segment number, which is the one + * needed from the archive. + */ + do + { + char fname[MAXFNAMELEN]; + + XLogFileName(fname, timeline, nextSegNo, WalSegSz); + + /* + * If the WAL segment has already been exported, increment the counter + * and check for the next segment. + */ + exists = false; + if (simple_string_list_member(&TmpWalSegList, fname)) + { + nextSegNo += 1; + exists = true; + } + } while (exists); + + return nextSegNo; +} diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index 393d6bfa9ef..615227b691c 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -43,6 +43,10 @@ static const char *progname; int WalSegSz = DEFAULT_XLOG_SEG_SIZE; static volatile sig_atomic_t time_to_stop = false; +/* Temporary exported WAL file directory and the list */ +char *TmpWalSegDir = NULL; +SimpleStringList TmpWalSegList = {NULL, NULL}; + static const RelFileLocator emptyRelFileLocator = {0, 0, 0}; typedef struct XLogDumpConfig @@ -360,6 +364,41 @@ is_tar_file(const char *fname, pg_compress_algorithm *compression) return true; } +/* + * Set up a temporary directory to temporarily store WAL segments. + */ +static void +setup_tmp_walseg_dir(const char *waldir) +{ + /* + * Use the directory specified by the TMPDIR environment variable. If it's + * not set, use the provided WAL directory. + */ + TmpWalSegDir = getenv("TMPDIR") ? + pg_strdup(getenv("TMPDIR")) : pg_strdup(waldir); + canonicalize_path(TmpWalSegDir); +} + +/* + * Remove the temporarily stored WAL segments, if any, at exit. 
+ */ +static void +remove_tmp_walseg_dir_atexit(void) +{ + SimpleStringListCell *cell; + + /* Clear out any existing temporary files */ + for (cell = TmpWalSegList.head; cell; cell = cell->next) + { + char *fpath = get_tmp_wal_file_path(cell->val); + + if (unlink(fpath) == 0) + pg_log_info("removed file \"%s\"", fpath); + pfree(fpath); + } +} + + /* * Initializes the tar archive reader and a temporary directory for WAL files. */ @@ -404,6 +443,7 @@ init_tar_archive_reader(XLogDumpPrivate *private, const char *waldir, streamer = astreamer_zstd_decompressor_new(streamer); private->archive_streamer = streamer; + private->curSegNo = 0; } /* @@ -548,7 +588,7 @@ WALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen, XLogRecPtr targetPtr, char *readBuff) { XLogDumpPrivate *private = state->private_data; - int count = required_read_len(private, targetPtr, reqLen); + int count = required_read_len(private, targetPagePtr, reqLen); WALReadError errinfo; if (private->endptr_reached) @@ -607,12 +647,70 @@ TarWALDumpReadPage(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen, XLogRecPtr targetPtr, char *readBuff) { XLogDumpPrivate *private = state->private_data; - int count = required_read_len(private, targetPtr, reqLen); + int count = required_read_len(private, targetPagePtr, reqLen); + XLogSegNo nextSegNo; if (private->endptr_reached) return -1; - /* Read the WAL page from the archive streamer */ + /* + * If the target page is in a different segment, first check for the WAL + * segment's physical existence in the temporary directory. + */ + nextSegNo = state->seg.ws_segno; + if (!XLByteInSeg(targetPagePtr, nextSegNo, WalSegSz)) + { + char fname[MAXFNAMELEN]; + char *fpath; + + if (state->seg.ws_file >= 0) + { + close(state->seg.ws_file); + state->seg.ws_file = -1; + + /* Remove this file, as it is no longer needed. 
*/ + XLogFileName(fname, state->seg.ws_tli, nextSegNo, WalSegSz); + fpath = get_tmp_wal_file_path(fname); + pg_log_info("removing file \"%s\"", fpath); + unlink(fpath); + pfree(fpath); + } + + XLByteToSeg(targetPagePtr, nextSegNo, WalSegSz); + state->seg.ws_tli = private->timeline; + state->seg.ws_segno = nextSegNo; + + /* + * If the next segment exists, open it and continue reading from there + */ + XLogFileName(fname, private->timeline, nextSegNo, WalSegSz); + if (simple_string_list_member(&TmpWalSegList, fname)) + { + fpath = get_tmp_wal_file_path(fname); + state->seg.ws_file = open(fpath, O_RDONLY | PG_BINARY, 0); + + if (state->seg.ws_file < 0) + pg_fatal("could not open file \"%s\": %m", fpath); + pfree(fpath); + } + } + + /* Continue reading from the open WAL segment, if any */ + if (state->seg.ws_file >= 0) + { + /* + * To prevent a race condition where the archive streamer is still + * exporting a file that we are trying to read, we invoke the streamer + * to ensure enough data is available. + */ + if (private->curSegNo == state->seg.ws_segno) + astreamer_archive_read(private); + + return WALDumpReadPage(state, targetPagePtr, reqLen, targetPtr, + readBuff); + } + + /* Otherwise, read the WAL page from the archive streamer */ return astreamer_wal_read(readBuff, targetPagePtr, count, private); } @@ -1340,7 +1438,6 @@ main(int argc, char **argv) if (fname != NULL && is_tar_file(fname, &compression)) { private.archive_name = fname; - waldir = walpath ? pg_strdup(walpath) : pg_strdup("."); is_tar = true; } else @@ -1434,6 +1531,13 @@ main(int argc, char **argv) init_tar_archive_reader(&private, waldir, private.startptr, private.endptr, compression); + /* + * Setup temporary directory to store WAL segments and set up an exit + * callback to remove it upon completion. 
+ */ + setup_tmp_walseg_dir(waldir); + atexit(remove_tmp_walseg_dir_atexit); + /* Routine to decode WAL files in tar archive */ xlogreader_state = XLogReaderAllocate(WalSegSz, waldir, diff --git a/src/bin/pg_waldump/pg_waldump.h b/src/bin/pg_waldump/pg_waldump.h index 4205e0ef597..1a1cf35e6f3 100644 --- a/src/bin/pg_waldump/pg_waldump.h +++ b/src/bin/pg_waldump/pg_waldump.h @@ -13,9 +13,14 @@ #include "access/xlogdefs.h" #include "fe_utils/astreamer.h" +#include "fe_utils/simple_list.h" #include "lib/stringinfo.h" +#define TEMP_FILE_EXT "waldump.tmp" + extern int WalSegSz; +extern char *TmpWalSegDir; +extern SimpleStringList TmpWalSegList; /* Contains the necessary information to drive WAL decoding */ typedef struct XLogDumpPrivate @@ -31,15 +36,32 @@ typedef struct XLogDumpPrivate astreamer *archive_streamer; StringInfo archive_streamer_buf; /* Buffer for receiving WAL data */ - XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with records - until this record pointer */ + XLogRecPtr archive_streamer_read_ptr; /* Populate the buffer with + * records until this record + * pointer */ + XLogSegNo curSegNo; /* Current segment being read */ } XLogDumpPrivate; +/* + * Generate the temporary WAL file path. + * + * Note that the caller is responsible to pfree it. 
+ */ +static inline char * +get_tmp_wal_file_path(const char *fname) +{ + char *fpath = (char *) palloc(MAXPGPATH); -extern astreamer *astreamer_waldump_new(XLogRecPtr startptr, - XLogRecPtr endptr, + snprintf(fpath, MAXPGPATH, "%s/%s.%s", TmpWalSegDir, fname, + TEMP_FILE_EXT); + + return fpath; +} + +extern astreamer *astreamer_waldump_new(XLogRecPtr startptr, XLogRecPtr endptr, XLogDumpPrivate *privateInfo); extern int astreamer_wal_read(char *readBuff, XLogRecPtr startptr, Size count, XLogDumpPrivate *privateInfo); +extern int astreamer_archive_read(XLogDumpPrivate *privateInfo); #endif /* end of PG_WALDUMP_H */ diff --git a/src/bin/pg_waldump/t/001_basic.pl b/src/bin/pg_waldump/t/001_basic.pl index 443126a9ce6..d5fa1f6d28d 100644 --- a/src/bin/pg_waldump/t/001_basic.pl +++ b/src/bin/pg_waldump/t/001_basic.pl @@ -7,6 +7,7 @@ use Cwd; use PostgreSQL::Test::Cluster; use PostgreSQL::Test::Utils; use Test::More; +use List::Util qw(shuffle); my $tar = $ENV{TAR}; @@ -272,7 +273,7 @@ sub generate_archive } closedir $dh; - @files = sort @files; + @files = shuffle @files; # move into the WAL directory before archiving files my $cwd = getcwd; -- 2.47.1
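To illustrate the bookkeeping that member_next_segno() in patch 5/8 performs: starting from the segment after the current one, it skips every segment whose file name is already recorded as exported to temporary storage and returns the first one that is not. The following is a minimal standalone sketch, assuming a plain array of file names in place of TmpWalSegList and a linear search in place of simple_string_list_member(). All sketch_* names are hypothetical; the file name layout (timeline, then the segment number split by segments per 4 GB, each as 8 hex digits) follows XLogFileName().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_FNAME_LEN 24		/* 8 hex digits each: timeline, log, segment */

/* Build a WAL segment file name using the XLogFileName() layout. */
static void
sketch_wal_file_name(char *buf, size_t buflen, uint32_t tli,
					 uint64_t segno, uint64_t segsz)
{
	uint64_t	per_xlogid;

	assert(segsz > 0);
	per_xlogid = UINT64_C(0x100000000) / segsz;
	snprintf(buf, buflen, "%08X%08X%08X",
			 (unsigned int) tli,
			 (unsigned int) (segno / per_xlogid),
			 (unsigned int) (segno % per_xlogid));
}

/* Linear search standing in for simple_string_list_member(). */
static bool
sketch_is_exported(const char *const *exported, size_t n, const char *fname)
{
	for (size_t i = 0; i < n; i++)
	{
		if (strcmp(exported[i], fname) == 0)
			return true;
	}
	return false;
}

/*
 * Return the first segment number after cur_segno that has not already
 * been exported, i.e. the next one still needed from the archive.
 */
static uint64_t
sketch_next_needed_segno(uint64_t cur_segno, uint32_t tli, uint64_t segsz,
						 const char *const *exported, size_t n)
{
	uint64_t	next = cur_segno + 1;
	char		fname[SKETCH_FNAME_LEN + 1];

	for (;;)
	{
		sketch_wal_file_name(fname, sizeof(fname), tli, next, segsz);
		if (!sketch_is_exported(exported, n, fname))
			return next;
		next++;					/* already on disk, try the following one */
	}
}
```

For example, with segments 5, 6 and 8 of timeline 1 already exported and 16 MB segments, the segment needed after 4 is 7, and the one needed after 7 is 9.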
From 9c768466e35384d3366abd6ce0d04b6932116256 Mon Sep 17 00:00:00 2001 From: Amul Sul <sulamul@gmail.com> Date: Wed, 16 Jul 2025 14:47:43 +0530 Subject: [PATCH v4 6/8] pg_verifybackup: Delay default WAL directory preparation. We are not sure whether to parse WAL from a directory or an archive until the backup format is known. Therefore, we delay preparing the default WAL directory until the point of parsing. This delay is harmless, as the WAL directory is not used elsewhere. --- src/bin/pg_verifybackup/pg_verifybackup.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c index 5e6c13bb921..31ebc1581fb 100644 --- a/src/bin/pg_verifybackup/pg_verifybackup.c +++ b/src/bin/pg_verifybackup/pg_verifybackup.c @@ -285,10 +285,6 @@ main(int argc, char **argv) manifest_path = psprintf("%s/backup_manifest", context.backup_directory); - /* By default, look for the WAL in the backup directory, too. */ - if (wal_directory == NULL) - wal_directory = psprintf("%s/pg_wal", context.backup_directory); - /* * Try to read the manifest. We treat any errors encountered while parsing * the manifest as fatal; there doesn't seem to be much point in trying to @@ -368,6 +364,10 @@ main(int argc, char **argv) if (context.format == 'p' && !context.skip_checksums) verify_backup_checksums(&context); + /* By default, look for the WAL in the backup directory, too. */ + if (wal_directory == NULL) + wal_directory = psprintf("%s/pg_wal", context.backup_directory); + /* * Try to parse the required ranges of WAL records, unless we were told * not to do so. -- 2.47.1
From 43c598c482171e1c5d764ead6614c95104207aa4 Mon Sep 17 00:00:00 2001 From: Amul Sul <sulamul@gmail.com> Date: Thu, 24 Jul 2025 16:37:43 +0530 Subject: [PATCH v4 7/8] pg_verifybackup: Rename the wal-directory switch to wal-path With the previous patches, pg_waldump can now decode WAL directly from tar files. This means you'll be able to specify a tar archive path instead of a traditional WAL directory. To keep things consistent and more versatile, we should also generalize the input switch for pg_verifybackup. It should accept either a directory or a tar file path that contains WAL files. This change also aligns it with the existing manifest-path switch naming. --- doc/src/sgml/ref/pg_verifybackup.sgml | 2 +- src/bin/pg_verifybackup/pg_verifybackup.c | 22 +++++++++++----------- src/bin/pg_verifybackup/po/de.po | 4 ++-- src/bin/pg_verifybackup/po/el.po | 4 ++-- src/bin/pg_verifybackup/po/es.po | 4 ++-- src/bin/pg_verifybackup/po/fr.po | 4 ++-- src/bin/pg_verifybackup/po/it.po | 4 ++-- src/bin/pg_verifybackup/po/ja.po | 4 ++-- src/bin/pg_verifybackup/po/ka.po | 4 ++-- src/bin/pg_verifybackup/po/ko.po | 4 ++-- src/bin/pg_verifybackup/po/ru.po | 4 ++-- src/bin/pg_verifybackup/po/sv.po | 4 ++-- src/bin/pg_verifybackup/po/uk.po | 4 ++-- src/bin/pg_verifybackup/po/zh_CN.po | 4 ++-- src/bin/pg_verifybackup/po/zh_TW.po | 4 ++-- src/bin/pg_verifybackup/t/007_wal.pl | 4 ++-- 16 files changed, 40 insertions(+), 40 deletions(-) diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml index 61c12975e4a..e9b8bfd51b1 100644 --- a/doc/src/sgml/ref/pg_verifybackup.sgml +++ b/doc/src/sgml/ref/pg_verifybackup.sgml @@ -261,7 +261,7 @@ PostgreSQL documentation <varlistentry> <term><option>-w <replaceable class="parameter">path</replaceable></option></term> - <term><option>--wal-directory=<replaceable class="parameter">path</replaceable></option></term> + <term><option>--wal-path=<replaceable class="parameter">path</replaceable></option></term> 
<listitem> <para> Try to parse WAL files stored in the specified directory, rather than diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c index 31ebc1581fb..1ee400199da 100644 --- a/src/bin/pg_verifybackup/pg_verifybackup.c +++ b/src/bin/pg_verifybackup/pg_verifybackup.c @@ -93,7 +93,7 @@ static void verify_file_checksum(verifier_context *context, uint8 *buffer); static void parse_required_wal(verifier_context *context, char *pg_waldump_path, - char *wal_directory); + char *wal_path); static astreamer *create_archive_verifier(verifier_context *context, char *archive_name, Oid tblspc_oid, @@ -126,7 +126,7 @@ main(int argc, char **argv) {"progress", no_argument, NULL, 'P'}, {"quiet", no_argument, NULL, 'q'}, {"skip-checksums", no_argument, NULL, 's'}, - {"wal-directory", required_argument, NULL, 'w'}, + {"wal-path", required_argument, NULL, 'w'}, {NULL, 0, NULL, 0} }; @@ -135,7 +135,7 @@ main(int argc, char **argv) char *manifest_path = NULL; bool no_parse_wal = false; bool quiet = false; - char *wal_directory = NULL; + char *wal_path = NULL; char *pg_waldump_path = NULL; DIR *dir; @@ -221,8 +221,8 @@ main(int argc, char **argv) context.skip_checksums = true; break; case 'w': - wal_directory = pstrdup(optarg); - canonicalize_path(wal_directory); + wal_path = pstrdup(optarg); + canonicalize_path(wal_path); break; default: /* getopt_long already emitted a complaint */ @@ -365,15 +365,15 @@ main(int argc, char **argv) verify_backup_checksums(&context); /* By default, look for the WAL in the backup directory, too. */ - if (wal_directory == NULL) - wal_directory = psprintf("%s/pg_wal", context.backup_directory); + if (wal_path == NULL) + wal_path = psprintf("%s/pg_wal", context.backup_directory); /* * Try to parse the required ranges of WAL records, unless we were told * not to do so. 
*/ if (!no_parse_wal) - parse_required_wal(&context, pg_waldump_path, wal_directory); + parse_required_wal(&context, pg_waldump_path, wal_path); /* * If everything looks OK, tell the user this, unless we were asked to @@ -1198,7 +1198,7 @@ verify_file_checksum(verifier_context *context, manifest_file *m, */ static void parse_required_wal(verifier_context *context, char *pg_waldump_path, - char *wal_directory) + char *wal_path) { manifest_data *manifest = context->manifest; manifest_wal_range *this_wal_range = manifest->first_wal_range; @@ -1208,7 +1208,7 @@ parse_required_wal(verifier_context *context, char *pg_waldump_path, char *pg_waldump_cmd; pg_waldump_cmd = psprintf("\"%s\" --quiet --path=\"%s\" --timeline=%u --start=%X/%08X --end=%X/%08X\n", - pg_waldump_path, wal_directory, this_wal_range->tli, + pg_waldump_path, wal_path, this_wal_range->tli, LSN_FORMAT_ARGS(this_wal_range->start_lsn), LSN_FORMAT_ARGS(this_wal_range->end_lsn)); fflush(NULL); @@ -1376,7 +1376,7 @@ usage(void) printf(_(" -P, --progress show progress information\n")); printf(_(" -q, --quiet do not print any output, except for errors\n")); printf(_(" -s, --skip-checksums skip checksum verification\n")); - printf(_(" -w, --wal-directory=PATH use specified path for WAL files\n")); + printf(_(" -w, --wal-path=PATH use specified path for WAL files\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_(" -?, --help show this help, then exit\n")); printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT); diff --git a/src/bin/pg_verifybackup/po/de.po b/src/bin/pg_verifybackup/po/de.po index a9e24931100..9b5cd5898cf 100644 --- a/src/bin/pg_verifybackup/po/de.po +++ b/src/bin/pg_verifybackup/po/de.po @@ -785,8 +785,8 @@ msgstr " -s, --skip-checksums Überprüfung der Prüfsummen überspringe #: pg_verifybackup.c:1379 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PFAD angegebenen Pfad für WAL-Dateien verwenden\n" 
+msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PFAD angegebenen Pfad für WAL-Dateien verwenden\n" #: pg_verifybackup.c:1380 #, c-format diff --git a/src/bin/pg_verifybackup/po/el.po b/src/bin/pg_verifybackup/po/el.po index 3e3f20c67c5..81442f51c17 100644 --- a/src/bin/pg_verifybackup/po/el.po +++ b/src/bin/pg_verifybackup/po/el.po @@ -494,8 +494,8 @@ msgstr " -s, --skip-checksums παράκαμψε την επαλήθευ #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH χρησιμοποίησε την καθορισμένη διαδρομή για αρχεία WAL\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/es.po b/src/bin/pg_verifybackup/po/es.po index 0cb958f3448..7f729fa35ba 100644 --- a/src/bin/pg_verifybackup/po/es.po +++ b/src/bin/pg_verifybackup/po/es.po @@ -495,8 +495,8 @@ msgstr " -s, --skip-checksums omitir la verificación de la suma de comp #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH utilizar la ruta especificada para los archivos WAL\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH utilizar la ruta especificada para los archivos WAL\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/fr.po b/src/bin/pg_verifybackup/po/fr.po index da8c72f6427..09937966fa7 100644 --- a/src/bin/pg_verifybackup/po/fr.po +++ b/src/bin/pg_verifybackup/po/fr.po @@ -498,8 +498,8 @@ msgstr " -s, --skip-checksums ignore la vérification des sommes de cont #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n" 
+msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=CHEMIN utilise le chemin spécifié pour les fichiers WAL\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/it.po b/src/bin/pg_verifybackup/po/it.po index 317b0b71e7f..4da68d0074e 100644 --- a/src/bin/pg_verifybackup/po/it.po +++ b/src/bin/pg_verifybackup/po/it.po @@ -472,8 +472,8 @@ msgstr " -s, --skip-checksums salta la verifica del checksum\n" #: pg_verifybackup.c:911 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH usa il percorso specificato per i file WAL\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH usa il percorso specificato per i file WAL\n" #: pg_verifybackup.c:912 #, c-format diff --git a/src/bin/pg_verifybackup/po/ja.po b/src/bin/pg_verifybackup/po/ja.po index c910fb236cc..a948959b54f 100644 --- a/src/bin/pg_verifybackup/po/ja.po +++ b/src/bin/pg_verifybackup/po/ja.po @@ -672,8 +672,8 @@ msgstr " -s, --skip-checksums チェックサム検証をスキップ\n" #: pg_verifybackup.c:1379 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH WALファイルに指定したパスを使用する\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH WALファイルに指定したパスを使用する\n" #: pg_verifybackup.c:1380 #, c-format diff --git a/src/bin/pg_verifybackup/po/ka.po b/src/bin/pg_verifybackup/po/ka.po index 982751984c7..ef2799316a8 100644 --- a/src/bin/pg_verifybackup/po/ka.po +++ b/src/bin/pg_verifybackup/po/ka.po @@ -784,8 +784,8 @@ msgstr " -s, --skip-checksums საკონტროლო ჯამ #: pg_verifybackup.c:1379 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=ბილიკი WAL ფაილებისთვის მითითებული ბილიკის გამოყენება\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=ბილიკი WAL ფაილებისთვის 
მითითებული ბილიკის გამოყენება\n" #: pg_verifybackup.c:1380 #, c-format diff --git a/src/bin/pg_verifybackup/po/ko.po b/src/bin/pg_verifybackup/po/ko.po index acdc3da5e02..eaf91ef1e98 100644 --- a/src/bin/pg_verifybackup/po/ko.po +++ b/src/bin/pg_verifybackup/po/ko.po @@ -501,8 +501,8 @@ msgstr " -s, --skip-checksums 체크섬 검사 건너뜀\n" #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=경로 WAL 파일이 있는 경로 지정\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=경로 WAL 파일이 있는 경로 지정\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/ru.po b/src/bin/pg_verifybackup/po/ru.po index 64005feedfd..7fb0e5ab1f6 100644 --- a/src/bin/pg_verifybackup/po/ru.po +++ b/src/bin/pg_verifybackup/po/ru.po @@ -507,9 +507,9 @@ msgstr " -s, --skip-checksums пропустить проверку ко #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" msgstr "" -" -w, --wal-directory=ПУТЬ использовать заданный путь к файлам WAL\n" +" -w, --wal-path=ПУТЬ использовать заданный путь к файлам WAL\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/sv.po b/src/bin/pg_verifybackup/po/sv.po index 17240feeb5c..97125838e8c 100644 --- a/src/bin/pg_verifybackup/po/sv.po +++ b/src/bin/pg_verifybackup/po/sv.po @@ -492,8 +492,8 @@ msgstr " -s, --skip-checksums hoppa över verifiering av kontrollsummor\ #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=SÖKVÄG använd denna sökväg till WAL-filer\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=SÖKVÄG använd denna sökväg till WAL-filer\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/uk.po b/src/bin/pg_verifybackup/po/uk.po index 
034b9764232..63f8041ab38 100644 --- a/src/bin/pg_verifybackup/po/uk.po +++ b/src/bin/pg_verifybackup/po/uk.po @@ -484,8 +484,8 @@ msgstr " -s, --skip-checksums не перевіряти контрольні с #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH використовувати вказаний шлях для файлів WAL\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH використовувати вказаний шлях для файлів WAL\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/po/zh_CN.po b/src/bin/pg_verifybackup/po/zh_CN.po index b7d97c8976d..fb6fcae8b82 100644 --- a/src/bin/pg_verifybackup/po/zh_CN.po +++ b/src/bin/pg_verifybackup/po/zh_CN.po @@ -465,8 +465,8 @@ msgstr " -s, --skip-checksums 跳过校验和验证\n" #: pg_verifybackup.c:919 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH 对WAL文件使用指定路径\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH 对WAL文件使用指定路径\n" #: pg_verifybackup.c:920 #, c-format diff --git a/src/bin/pg_verifybackup/po/zh_TW.po b/src/bin/pg_verifybackup/po/zh_TW.po index c1b710b0a36..568f972b0bb 100644 --- a/src/bin/pg_verifybackup/po/zh_TW.po +++ b/src/bin/pg_verifybackup/po/zh_TW.po @@ -555,8 +555,8 @@ msgstr " -s, --skip-checksums 跳過檢查碼驗證\n" #: pg_verifybackup.c:992 #, c-format -msgid " -w, --wal-directory=PATH use specified path for WAL files\n" -msgstr " -w, --wal-directory=PATH 用指定的路徑存放 WAL 檔\n" +msgid " -w, --wal-path=PATH use specified path for WAL files\n" +msgstr " -w, --wal-path=PATH 用指定的路徑存放 WAL 檔\n" #: pg_verifybackup.c:993 #, c-format diff --git a/src/bin/pg_verifybackup/t/007_wal.pl b/src/bin/pg_verifybackup/t/007_wal.pl index babc4f0a86b..b07f80719b0 100644 --- a/src/bin/pg_verifybackup/t/007_wal.pl +++ b/src/bin/pg_verifybackup/t/007_wal.pl @@ -42,10 +42,10 @@ command_ok([ 'pg_verifybackup', '--no-parse-wal', 
$backup_path ], command_ok( [ 'pg_verifybackup', - '--wal-directory' => $relocated_pg_wal, + '--wal-path' => $relocated_pg_wal, $backup_path ], - '--wal-directory can be used to specify WAL directory'); + '--wal-path can be used to specify WAL directory'); # Move directory back to original location. rename($relocated_pg_wal, $original_pg_wal) || die "rename pg_wal back: $!"; -- 2.47.1
From ca9b3ccec8143a53c2dbcae3a11a0edf09f5b96b Mon Sep 17 00:00:00 2001 From: Amul Sul <sulamul@gmail.com> Date: Thu, 17 Jul 2025 16:39:36 +0530 Subject: [PATCH v4 8/8] pg_verifybackup: enable WAL parsing for tar-format backups Now that pg_waldump supports decoding from tar archives, we should leverage this functionality to remove the previous restriction on WAL parsing for tar-format backups. --- doc/src/sgml/ref/pg_verifybackup.sgml | 5 +- src/bin/pg_verifybackup/pg_verifybackup.c | 66 +++++++++++++------ src/bin/pg_verifybackup/t/002_algorithm.pl | 4 -- src/bin/pg_verifybackup/t/003_corruption.pl | 4 +- src/bin/pg_verifybackup/t/008_untar.pl | 3 +- src/bin/pg_verifybackup/t/010_client_untar.pl | 3 +- 6 files changed, 50 insertions(+), 35 deletions(-) diff --git a/doc/src/sgml/ref/pg_verifybackup.sgml b/doc/src/sgml/ref/pg_verifybackup.sgml index e9b8bfd51b1..16b50b5a4df 100644 --- a/doc/src/sgml/ref/pg_verifybackup.sgml +++ b/doc/src/sgml/ref/pg_verifybackup.sgml @@ -36,10 +36,7 @@ PostgreSQL documentation <literal>backup_manifest</literal> generated by the server at the time of the backup. The backup may be stored either in the "plain" or the "tar" format; this includes tar-format backups compressed with any algorithm - supported by <application>pg_basebackup</application>. However, at present, - <literal>WAL</literal> verification is supported only for plain-format - backups. Therefore, if the backup is stored in tar-format, the - <literal>-n, --no-parse-wal</literal> option should be used. + supported by <application>pg_basebackup</application>. </para> <para> diff --git a/src/bin/pg_verifybackup/pg_verifybackup.c b/src/bin/pg_verifybackup/pg_verifybackup.c index 1ee400199da..4bfe6fdff16 100644 --- a/src/bin/pg_verifybackup/pg_verifybackup.c +++ b/src/bin/pg_verifybackup/pg_verifybackup.c @@ -74,7 +74,9 @@ pg_noreturn static void report_manifest_error(JsonManifestParseContext *context, const char *fmt,...) 
pg_attribute_printf(2, 3); -static void verify_tar_backup(verifier_context *context, DIR *dir); +static void verify_tar_backup(verifier_context *context, DIR *dir, + char **base_archive_path, + char **wal_archive_path); static void verify_plain_backup_directory(verifier_context *context, char *relpath, char *fullpath, DIR *dir); @@ -83,7 +85,9 @@ static void verify_plain_backup_file(verifier_context *context, char *relpath, static void verify_control_file(const char *controlpath, uint64 manifest_system_identifier); static void precheck_tar_backup_file(verifier_context *context, char *relpath, - char *fullpath, SimplePtrList *tarfiles); + char *fullpath, SimplePtrList *tarfiles, + char **base_archive_path, + char **wal_archive_path); static void verify_tar_file(verifier_context *context, char *relpath, char *fullpath, astreamer *streamer); static void report_extra_backup_files(verifier_context *context); @@ -136,6 +140,8 @@ main(int argc, char **argv) bool no_parse_wal = false; bool quiet = false; char *wal_path = NULL; + char *base_archive_path = NULL; + char *wal_archive_path = NULL; char *pg_waldump_path = NULL; DIR *dir; @@ -327,17 +333,6 @@ main(int argc, char **argv) pfree(path); } - /* - * XXX: In the future, we should consider enhancing pg_waldump to read WAL - * files from an archive. - */ - if (!no_parse_wal && context.format == 't') - { - pg_log_error("pg_waldump cannot read tar files"); - pg_log_error_hint("You must use -n/--no-parse-wal when verifying a tar-format backup."); - exit(1); - } - /* * Perform the appropriate type of verification appropriate based on the * backup format. This will close 'dir'. @@ -346,7 +341,7 @@ main(int argc, char **argv) verify_plain_backup_directory(&context, NULL, context.backup_directory, dir); else - verify_tar_backup(&context, dir); + verify_tar_backup(&context, dir, &base_archive_path, &wal_archive_path); /* * The "matched" flag should now be set on every entry in the hash table. 
@@ -364,9 +359,28 @@ main(int argc, char **argv) if (context.format == 'p' && !context.skip_checksums) verify_backup_checksums(&context); - /* By default, look for the WAL in the backup directory, too. */ + /* + * By default, WAL files are expected to be found in the backup directory + * for plain-format backups. In the case of tar-format backups, if a + * separate WAL archive is not found, the WAL files are most likely + * included within the main data directory archive. + */ if (wal_path == NULL) - wal_path = psprintf("%s/pg_wal", context.backup_directory); + { + if (context.format == 'p') + wal_path = psprintf("%s/pg_wal", context.backup_directory); + else if (wal_archive_path) + wal_path = wal_archive_path; + else if (base_archive_path) + wal_path = base_archive_path; + else + { + pg_log_error("WAL archive not found"); + pg_log_error_hint("Specify the correct path using the option -w/--wal-path, " + "or use -n/--no-parse-wal when verifying a tar-format backup."); + exit(1); + } + } /* * Try to parse the required ranges of WAL records, unless we were told @@ -787,7 +801,8 @@ verify_control_file(const char *controlpath, uint64 manifest_system_identifier) * close when we're done with it. */ static void -verify_tar_backup(verifier_context *context, DIR *dir) +verify_tar_backup(verifier_context *context, DIR *dir, char **base_archive_path, + char **wal_archive_path) { struct dirent *dirent; SimplePtrList tarfiles = {NULL, NULL}; @@ -816,7 +831,8 @@ verify_tar_backup(verifier_context *context, DIR *dir) char *fullpath; fullpath = psprintf("%s/%s", context->backup_directory, filename); - precheck_tar_backup_file(context, filename, fullpath, &tarfiles); + precheck_tar_backup_file(context, filename, fullpath, &tarfiles, + base_archive_path, wal_archive_path); pfree(fullpath); } } @@ -875,11 +891,13 @@ verify_tar_backup(verifier_context *context, DIR *dir) * * The arguments to this function are mostly the same as the * verify_plain_backup_file. 
The additional argument outputs a list of valid - * tar files. + * tar files, along with the full paths to the main archive and the WAL + * directory archive. */ static void precheck_tar_backup_file(verifier_context *context, char *relpath, - char *fullpath, SimplePtrList *tarfiles) + char *fullpath, SimplePtrList *tarfiles, + char **base_archive_path, char **wal_archive_path) { struct stat sb; Oid tblspc_oid = InvalidOid; @@ -918,9 +936,17 @@ precheck_tar_backup_file(verifier_context *context, char *relpath, * extension such as .gz, .lz4, or .zst. */ if (strncmp("base", relpath, 4) == 0) + { suffix = relpath + 4; + + *base_archive_path = pstrdup(fullpath); + } else if (strncmp("pg_wal", relpath, 6) == 0) + { suffix = relpath + 6; + + *wal_archive_path = pstrdup(fullpath); + } else { /* Expected a <tablespaceoid>.tar file here. */ diff --git a/src/bin/pg_verifybackup/t/002_algorithm.pl b/src/bin/pg_verifybackup/t/002_algorithm.pl index ae16c11bc4d..4f284a9e828 100644 --- a/src/bin/pg_verifybackup/t/002_algorithm.pl +++ b/src/bin/pg_verifybackup/t/002_algorithm.pl @@ -30,10 +30,6 @@ sub test_checksums { # Add switch to get a tar-format backup push @backup, ('--format' => 'tar'); - - # Add switch to skip WAL verification, which is not yet supported for - # tar-format backups - push @verify, ('--no-parse-wal'); } # A backup with a bogus algorithm should fail. diff --git a/src/bin/pg_verifybackup/t/003_corruption.pl b/src/bin/pg_verifybackup/t/003_corruption.pl index 1dd60f709cf..f1ebdbb46b4 100644 --- a/src/bin/pg_verifybackup/t/003_corruption.pl +++ b/src/bin/pg_verifybackup/t/003_corruption.pl @@ -193,10 +193,8 @@ for my $scenario (@scenario) command_ok([ $tar, '-cf' => "$tar_backup_path/base.tar", '.' ]); chdir($cwd) || die "chdir: $!"; - # Now check that the backup no longer verifies. We must use -n - # here, because pg_waldump can't yet read WAL from a tarfile. 
command_fails_like( - [ 'pg_verifybackup', '--no-parse-wal', $tar_backup_path ], + [ 'pg_verifybackup', $tar_backup_path ], $scenario->{'fails_like'}, "corrupt backup fails verification: $name"); diff --git a/src/bin/pg_verifybackup/t/008_untar.pl b/src/bin/pg_verifybackup/t/008_untar.pl index bc3d6b352ad..0cfe1f9532c 100644 --- a/src/bin/pg_verifybackup/t/008_untar.pl +++ b/src/bin/pg_verifybackup/t/008_untar.pl @@ -123,8 +123,7 @@ for my $tc (@test_configuration) # Verify tar backup. $primary->command_ok( [ - 'pg_verifybackup', '--no-parse-wal', - '--exit-on-error', $backup_path, + 'pg_verifybackup', '--exit-on-error', $backup_path, ], "verify backup, compression $method"); diff --git a/src/bin/pg_verifybackup/t/010_client_untar.pl b/src/bin/pg_verifybackup/t/010_client_untar.pl index b62faeb5acf..76269a73673 100644 --- a/src/bin/pg_verifybackup/t/010_client_untar.pl +++ b/src/bin/pg_verifybackup/t/010_client_untar.pl @@ -137,8 +137,7 @@ for my $tc (@test_configuration) # Verify tar backup. $primary->command_ok( [ - 'pg_verifybackup', '--no-parse-wal', - '--exit-on-error', $backup_path, + 'pg_verifybackup', '--exit-on-error', $backup_path, ], "verify backup, compression $method"); -- 2.47.1
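The default WAL location logic that 0008 adds to main() can be summarized by the fallback order below. This is a rough standalone sketch rather than the patch's code: choose_wal_path and dup_string are hypothetical names, malloc() stands in for psprintf()/pstrdup(), and a NULL return stands in for the pg_log_error() plus exit(1) path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: portable replacement for pstrdup()/strdup(). */
static char *
dup_string(const char *s)
{
	size_t		len = strlen(s) + 1;
	char	   *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Hypothetical sketch of the default WAL path selection when no
 * -w/--wal-path was given:
 *
 *   plain format ('p')  -> "<backup_dir>/pg_wal"
 *   tar format          -> separate WAL archive if one was seen,
 *                          else the base archive,
 *                          else NULL (the patch reports an error here)
 *
 * Returns a malloc'd string the caller must free(), or NULL.
 */
static char *
choose_wal_path(char format, const char *backup_dir,
				const char *wal_archive_path,
				const char *base_archive_path)
{
	assert(backup_dir != NULL);

	if (format == 'p')
	{
		size_t		len = strlen(backup_dir) + strlen("/pg_wal") + 1;
		char	   *path = malloc(len);

		if (path != NULL)
			snprintf(path, len, "%s/pg_wal", backup_dir);
		return path;
	}

	if (wal_archive_path != NULL)
		return dup_string(wal_archive_path);
	if (base_archive_path != NULL)
		return dup_string(base_archive_path);

	return NULL;				/* no archive that could contain WAL */
}
```

The ordering matters for tar-format backups: a separate pg_wal archive is preferred when one was found, and the base archive is only tried on the assumption that the WAL was included in it.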