I noticed that pg_resetwal has poor test coverage. There are some TAP tests, but they all run with -n, so they don't actually test the full functionality. (There is a non-dry-run call of pg_resetwal in the recovery test suite, but that is incidental.)

So I added a bunch of more tests to test all the different options and scenarios. That also led to adding more documentation about more details how some of the options behave, and some tweaks in the program output.

It's attached as one big patch, but it could be split into smaller pieces, depending what gets agreement.
From a8f1f7b84d38d5130f3bd8426be39cbc2c6866c3 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Wed, 30 Aug 2023 14:00:01 +0200
Subject: [PATCH v1] pg_resetwal: More tests, and some output improvements

- More tests and test coverage
- Update an obsolete comment
- Regroup --help output
- Improve documentation about pg_resetwal -f option
- Improve error output
---
 doc/src/sgml/ref/pg_resetwal.sgml      |  70 ++++++++++++---
 src/bin/pg_resetwal/pg_resetwal.c      |  58 +++++++------
 src/bin/pg_resetwal/t/001_basic.pl     | 114 +++++++++++++++++++++++++
 src/bin/pg_resetwal/t/002_corrupted.pl |   4 +
 4 files changed, 209 insertions(+), 37 deletions(-)

diff --git a/doc/src/sgml/ref/pg_resetwal.sgml 
b/doc/src/sgml/ref/pg_resetwal.sgml
index fd539f5604..d47f9552b8 100644
--- a/doc/src/sgml/ref/pg_resetwal.sgml
+++ b/doc/src/sgml/ref/pg_resetwal.sgml
@@ -52,21 +52,33 @@ <title>Description</title>
   </para>
 
   <para>
-   After running this command, it should be possible to start the server,
+   Some options, such as <option>--wal-segsize</option> (see below), can also
+   be used to modify certain global settings of a database cluster without the
+   need to rerun <command>initdb</command>.  This can be done safely on an
+   otherwise sound database cluster, if none of the dangerous modes mentioned
+   below are used.
+  </para>
+
+  <para>
+   If <command>pg_resetwal</command> is used on a data directory where the
+   server has been cleanly shut down and the control file is sound, then it
+   will have no effect on the contents of the database system, except that no
+   longer used WAL files are cleared away.  Any other use is potentially
+   dangerous and must be done with great care.  <command>pg_resetwal</command>
+   will require the <option>-f</option> (force) option to be specified before
+   working on a data directory in an unclean shutdown state or with a corrupt
+   control file.
+  </para>
+
+  <para>
+   After running this command on a data directory with corrupted WAL or a
+   corrupt control file, it should be possible to start the server,
    but bear in mind that the database might contain inconsistent data due to
    partially-committed transactions.  You should immediately dump your data,
    run <command>initdb</command>, and restore.  After restore, check for
    inconsistencies and repair as needed.
   </para>
 
-  <para>
-   This utility can only be run by the user who installed the server, because
-   it requires read/write access to the data directory.
-   For safety reasons, you must specify the data directory on the command line.
-   <command>pg_resetwal</command> does not use the environment variable
-   <envar>PGDATA</envar>.
-  </para>
-
   <para>
    If <command>pg_resetwal</command> complains that it cannot determine
    valid data for <filename>pg_control</filename>, you can force it to proceed 
anyway
@@ -82,19 +94,41 @@ <title>Description</title>
    execute any data-modifying operations in the database before you dump,
    as any such action is likely to make the corruption worse.
   </para>
+
+  <para>
+   This utility can only be run by the user who installed the server, because
+   it requires read/write access to the data directory.
+  </para>
  </refsect1>
 
  <refsect1>
   <title>Options</title>
 
   <variablelist>
+   <varlistentry>
+    <term><replaceable class="parameter">datadir</replaceable></term>
+    <term><option>-D <replaceable 
class="parameter">datadir</replaceable></option></term>
+    <term><option>--pgdata=<replaceable 
class="parameter">datadir</replaceable></option></term>
+    <listitem>
+     <para>
+      Specifies the location of the database directory.
+      For safety reasons, you must specify the data directory on the command
+      line.  <command>pg_resetwal</command> does not use the environment
+      variable <envar>PGDATA</envar>.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><option>-f</option></term>
     <term><option>--force</option></term>
     <listitem>
      <para>
-      Force <command>pg_resetwal</command> to proceed even if it cannot 
determine
-      valid data for <filename>pg_control</filename>, as explained above.
+      Force <command>pg_resetwal</command> to proceed even in situations where
+      it could be dangerous, as explained above.  Specifically, this option is
+      required to proceed if the server had not been cleanly shut down, or if
+      <command>pg_resetwal</command> cannot determine valid data for
+      <filename>pg_control</filename>.
      </para>
     </listitem>
    </varlistentry>
@@ -132,7 +166,8 @@ <title>Options</title>
    <command>pg_resetwal</command> is unable to determine appropriate values
    by reading <filename>pg_control</filename>.  Safe values can be determined 
as
    described below.  For values that take numeric arguments, hexadecimal
-   values can be specified by using the prefix <literal>0x</literal>.
+   values can be specified by using the prefix <literal>0x</literal>. Note
+   that these instructions only apply with the standard block size of 8 kB.
   </para>
 
   <variablelist>
@@ -155,6 +190,7 @@ <title>Options</title>
       greatest file name in the same directory.  The file names are in
       hexadecimal.
      </para>
+      <!-- FIXME: multiplier? -->
     </listitem>
    </varlistentry>
 
@@ -238,6 +274,7 @@ <title>Options</title>
       names are in hexadecimal, so the easiest way to do this is to specify
       the option value in hexadecimal and append four zeroes.
      </para>
+     <!-- 65536 = SLRU_PAGES_PER_SEGMENT * BLCKSZ / sizeof(MultiXactOffset) -->
     </listitem>
    </varlistentry>
 
@@ -272,6 +309,7 @@ <title>Options</title>
       The file names are in hexadecimal.  There is no simple recipe such as
       the ones for other options of appending zeroes.
      </para>
+     <!-- 52352 = SLRU_PAGES_PER_SEGMENT * floor(BLCKSZ/20) * 4; see 
multixact.c -->
     </listitem>
    </varlistentry>
 
@@ -284,6 +322,12 @@ <title>Options</title>
       linkend="app-initdb"/> for more information.
      </para>
 
+     <para>
+      This option can also be used to change the WAL segment size of an
+      existing database cluster, avoiding the need to
+      re-<command>initdb</command>.
+     </para>
+
      <note>
       <para>
        While <command>pg_resetwal</command> will set the WAL starting address
@@ -314,6 +358,7 @@ <title>Options</title>
       in <filename>pg_xact</filename>, <literal>-u 0x700000</literal> will 
work (five
       trailing zeroes provide the proper multiplier).
      </para>
+     <!-- 1048576 = SLRU_PAGES_PER_SEGMENT * BLCKSZ * CLOG_XACTS_PER_BYTE -->
     </listitem>
    </varlistentry>
 
@@ -335,6 +380,7 @@ <title>Options</title>
       in <filename>pg_xact</filename>, <literal>-x 0x1200000</literal> will 
work (five
       trailing zeroes provide the proper multiplier).
      </para>
+     <!-- 1048576 = SLRU_PAGES_PER_SEGMENT * BLCKSZ * CLOG_XACTS_PER_BYTE -->
     </listitem>
    </varlistentry>
   </variablelist>
diff --git a/src/bin/pg_resetwal/pg_resetwal.c 
b/src/bin/pg_resetwal/pg_resetwal.c
index 9bebc2a995..bb0df8221f 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -6,8 +6,7 @@
  *
  * The theory of operation is fairly simple:
  *       1. Read the existing pg_control (which will include the last
- *              checkpoint record).  If it is an old format then update to
- *              current format.
+ *              checkpoint record).
  *       2. If pg_control is corrupt, attempt to intuit reasonable values,
  *              by scanning the old xlog if necessary.
  *       3. Modify pg_control to reflect a "shutdown" state with a checkpoint
@@ -212,6 +211,7 @@ main(int argc, char *argv[])
                                        exit(1);
                                }
 
+                               // FIXME: why 2?
                                if (set_oldest_commit_ts_xid < 2 &&
                                        set_oldest_commit_ts_xid != 0)
                                        pg_fatal("transaction ID (-c) must be 
either 0 or greater than or equal to 2");
@@ -346,6 +346,10 @@ main(int argc, char *argv[])
 
        get_restricted_token();
 
+       if (chdir(DataDir) < 0)
+               pg_fatal("could not change directory to \"%s\": %m",
+                                DataDir);
+
        /* Set mask based on PGDATA permissions */
        if (!GetDataDirectoryCreatePerm(DataDir))
                pg_fatal("could not read permissions of directory \"%s\": %m",
@@ -353,10 +357,6 @@ main(int argc, char *argv[])
 
        umask(pg_mode_mask);
 
-       if (chdir(DataDir) < 0)
-               pg_fatal("could not change directory to \"%s\": %m",
-                                DataDir);
-
        /* Check that data directory matches our server version */
        CheckDataVersion();
 
@@ -459,20 +459,22 @@ main(int argc, char *argv[])
        if (minXlogSegNo > newXlogSegNo)
                newXlogSegNo = minXlogSegNo;
 
+       if (noupdate)
+       {
+               PrintNewControlValues();
+               exit(0);
+       }
+
        /*
         * If we had to guess anything, and -f was not given, just print the
-        * guessed values and exit.  Also print if -n is given.
+        * guessed values and exit.
         */
-       if ((guessed && !force) || noupdate)
+       if (guessed && !force)
        {
                PrintNewControlValues();
-               if (!noupdate)
-               {
-                       printf(_("\nIf these values seem acceptable, use -f to 
force reset.\n"));
-                       exit(1);
-               }
-               else
-                       exit(0);
+               pg_log_error("not proceeding because control file values were 
guessed");
+               pg_log_error_hint("If these values seem acceptable, use -f to 
force reset.");
+               exit(1);
        }
 
        /*
@@ -480,9 +482,9 @@ main(int argc, char *argv[])
         */
        if (ControlFile.state != DB_SHUTDOWNED && !force)
        {
-               printf(_("The database server was not shut down cleanly.\n"
-                                "Resetting the write-ahead log might cause 
data to be lost.\n"
-                                "If you want to proceed anyway, use -f to 
force reset.\n"));
+               pg_log_error("database server was not shut down cleanly");
+               pg_log_error_detail("Resetting the write-ahead log might cause 
data to be lost.");
+               pg_log_error_hint("If you want to proceed anyway, use -f to 
force reset.");
                exit(1);
        }
 
@@ -1129,24 +1131,30 @@ static void
 usage(void)
 {
        printf(_("%s resets the PostgreSQL write-ahead log.\n\n"), progname);
-       printf(_("Usage:\n  %s [OPTION]... DATADIR\n\n"), progname);
-       printf(_("Options:\n"));
+       printf(_("Usage:\n"));
+       printf(_("  %s [OPTION]... DATADIR\n"), progname);
+
+       printf(_("\nOptions:\n"));
+       printf(_(" [-D, --pgdata=]DATADIR  data directory\n"));
+       printf(_("  -f, --force            force update to be done even after 
unclean shutdown or\n"
+                        "                         if pg_control values had to 
be guessed\n"));
+       printf(_("  -n, --dry-run          no update, just show what would be 
done\n"));
+       printf(_("  -V, --version          output version information, then 
exit\n"));
+       printf(_("  -?, --help             show this help, then exit\n"));
+
+       printf(_("\nOptions to override control file values:\n"));
        printf(_("  -c, --commit-timestamp-ids=XID,XID\n"
                         "                                   set oldest and 
newest transactions bearing\n"
                         "                                   commit timestamp 
(zero means no change)\n"));
-       printf(_(" [-D, --pgdata=]DATADIR            data directory\n"));
        printf(_("  -e, --epoch=XIDEPOCH             set next transaction ID 
epoch\n"));
-       printf(_("  -f, --force                      force update to be 
done\n"));
        printf(_("  -l, --next-wal-file=WALFILE      set minimum starting 
location for new WAL\n"));
        printf(_("  -m, --multixact-ids=MXID,MXID    set next and oldest 
multitransaction ID\n"));
-       printf(_("  -n, --dry-run                    no update, just show what 
would be done\n"));
        printf(_("  -o, --next-oid=OID               set next OID\n"));
        printf(_("  -O, --multixact-offset=OFFSET    set next multitransaction 
offset\n"));
        printf(_("  -u, --oldest-transaction-id=XID  set oldest transaction 
ID\n"));
-       printf(_("  -V, --version                    output version 
information, then exit\n"));
        printf(_("  -x, --next-transaction-id=XID    set next transaction 
ID\n"));
        printf(_("      --wal-segsize=SIZE           size of WAL segments, in 
megabytes\n"));
-       printf(_("  -?, --help                       show this help, then 
exit\n"));
+
        printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
        printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
 }
diff --git a/src/bin/pg_resetwal/t/001_basic.pl 
b/src/bin/pg_resetwal/t/001_basic.pl
index 7e5efbf56b..001bb455ef 100644
--- a/src/bin/pg_resetwal/t/001_basic.pl
+++ b/src/bin/pg_resetwal/t/001_basic.pl
@@ -14,6 +14,7 @@
 
 my $node = PostgreSQL::Test::Cluster->new('main');
 $node->init;
+$node->append_conf('postgresql.conf', 'track_commit_timestamp = on');
 
 command_like([ 'pg_resetwal', '-n', $node->data_dir ],
        qr/checkpoint/, 'pg_resetwal -n produces output');
@@ -29,4 +30,117 @@
                'check PGDATA permissions');
 }
 
+command_ok([ 'pg_resetwal', '-D', $node->data_dir ], 'pg_resetwal runs');
+$node->start;
+is($node->safe_psql("postgres", "SELECT 1;"), 1, 'server running and working 
after reset');
+
+command_fails_like([ 'pg_resetwal', $node->data_dir ], qr/lock file .* 
exists/, 'fails if server running');
+
+$node->stop('immediate');
+command_fails_like([ 'pg_resetwal', $node->data_dir ], qr/database server was 
not shut down cleanly/, 'does not run after immediate shutdown');
+command_ok([ 'pg_resetwal', '-f', $node->data_dir ], 'runs after immediate 
shutdown with force');
+$node->start;
+is($node->safe_psql("postgres", "SELECT 1;"), 1, 'server running and working 
after forced reset');
+
+$node->stop;
+
+# check various command-line handling
+
+command_fails_like([ 'pg_resetwal', 'foo' ], qr/error: could not change 
directory/, 'fails with nonexistent data directory');
+command_fails_like([ 'pg_resetwal', 'foo', 'bar'], qr/too many command-line 
arguments/, 'fails with too many command-line arguments');
+
+$ENV{PGDATA} = $node->data_dir;  # not used
+command_fails_like([ 'pg_resetwal' ], qr/no data directory specified/, 'fails 
with too few command-line arguments');
+
+# error cases
+# -c
+command_fails_like([ 'pg_resetwal', '-c', 'foo', $node->data_dir], qr/error: 
invalid argument for option -c/, 'fails with incorrect -c option');
+command_fails_like([ 'pg_resetwal', '-c', '10,bar', $node->data_dir], 
qr/error: invalid argument for option -c/, 'fails with incorrect -c option part 
2');
+command_fails_like([ 'pg_resetwal', '-c', '1,10', $node->data_dir], qr/greater 
than/, 'fails with -c value 1 part 1');
+command_fails_like([ 'pg_resetwal', '-c', '10,1', $node->data_dir], qr/greater 
than/, 'fails with -c value 1 part 2');
+# -e
+command_fails_like([ 'pg_resetwal', '-e', 'foo', $node->data_dir], qr/error: 
invalid argument for option -e/, 'fails with incorrect -e option');
+command_fails_like([ 'pg_resetwal', '-e', '-1', $node->data_dir], qr/must not 
be -1/, 'fails with -e value -1');
+# -l
+command_fails_like([ 'pg_resetwal', '-l', 'foo', $node->data_dir], qr/error: 
invalid argument for option -l/, 'fails with incorrect -l option');
+# -m
+command_fails_like([ 'pg_resetwal', '-m', 'foo', $node->data_dir], qr/error: 
invalid argument for option -m/, 'fails with incorrect -m option');
+command_fails_like([ 'pg_resetwal', '-m', '10,bar', $node->data_dir], 
qr/error: invalid argument for option -m/, 'fails with incorrect -m option part 
2');
+command_fails_like([ 'pg_resetwal', '-m', '0,10', $node->data_dir], qr/must 
not be 0/, 'fails with -m value 0 part 1');
+command_fails_like([ 'pg_resetwal', '-m', '10,0', $node->data_dir], qr/must 
not be 0/, 'fails with -m value 0 part 2');
+# -o
+command_fails_like([ 'pg_resetwal', '-o', 'foo', $node->data_dir], qr/error: 
invalid argument for option -o/, 'fails with incorrect -o option');
+command_fails_like([ 'pg_resetwal', '-o', '0', $node->data_dir], qr/must not 
be 0/, 'fails with -o value 0');
+# -O
+command_fails_like([ 'pg_resetwal', '-O', 'foo', $node->data_dir], qr/error: 
invalid argument for option -O/, 'fails with incorrect -O option');
+command_fails_like([ 'pg_resetwal', '-O', '-1', $node->data_dir], qr/must not 
be -1/, 'fails with -O value -1');
+# --wal-segsize
+command_fails_like([ 'pg_resetwal', '--wal-segsize', 'foo', $node->data_dir], 
qr/error: invalid value/, 'fails with incorrect --wal-segsize option');
+command_fails_like([ 'pg_resetwal', '--wal-segsize', '13', $node->data_dir], 
qr/must be a power/, 'fails with invalid --wal-segsize value');
+# -u
+command_fails_like([ 'pg_resetwal', '-u', 'foo', $node->data_dir], qr/error: 
invalid argument for option -u/, 'fails with incorrect -u option');
+command_fails_like([ 'pg_resetwal', '-u', '1', $node->data_dir], qr/must be 
greater than/, 'fails with -u value too small');
+# -x
+command_fails_like([ 'pg_resetwal', '-x', 'foo', $node->data_dir], qr/error: 
invalid argument for option -x/, 'fails with incorrect -x option');
+command_fails_like([ 'pg_resetwal', '-x', '1', $node->data_dir], qr/must be 
greater than/, 'fails with -x value too small');
+
+# run with control override options
+
+my $out = (run_command([ 'pg_resetwal', '-n', $node->data_dir ]))[0];
+$out =~ /^Database block size: *(\d+)$/m or die;
+my $blcksz = $1;
+
+my @cmd = ('pg_resetwal', '-D', $node->data_dir);
+
+# some not-so-critical hardcoded values
+push @cmd, '-e', 1;
+push @cmd, '-l', '00000001000000320000004B';
+push @cmd, '-o', 100_000;
+push @cmd, '--wal-segsize', 1;
+
+# these use the guidance from the documentation
+
+sub get_slru_files
+{
+       opendir(my $dh, $node->data_dir . '/' . $_[0]) or die $!;
+       my @files = sort grep { /[0-9A-F]+/ } readdir $dh;
+       closedir $dh;
+       return @files;
+}
+
+my (@files, $mult);
+
+@files = get_slru_files('pg_commit_ts');
+$mult = 32 * $blcksz * 4; # FIXME
+# -c argument is "old,new"
+push @cmd,
+  '-c', sprintf("%d,%d",
+         hex($files[0]) == 0 ? 2 : hex($files[0]), hex($files[-1]));
+
+@files = get_slru_files('pg_multixact/offsets');
+$mult = 32 * $blcksz / 4;
+# -m argument is "new,old"
+push @cmd,
+  '-m', sprintf("%d,%d",
+         (hex($files[-1]) + 1) * $mult,
+         hex($files[0]) == 0 ? 1 : hex($files[0] * $mult));
+
+@files = get_slru_files('pg_multixact/members');
+$mult = 32 * int($blcksz/20) * 4;
+push @cmd,
+  '-O', (hex($files[-1]) + 1) * $mult;
+
+@files = get_slru_files('pg_xact');
+$mult = 32 * $blcksz * 4;
+push @cmd,
+  '-u', (hex($files[0]) == 0 ? 3 : hex($files[0]) * $mult),
+  '-x', ((hex($files[-1]) + 1) * $mult);
+
+command_ok([@cmd, '-n'], 'runs with control override options, dry run');
+command_ok(\@cmd, 'runs with control override options');
+command_like([ 'pg_resetwal', '-n', $node->data_dir ], qr/^Latest checkpoint's 
NextOID: *100000$/m, 'spot check that control changes were applied');
+
+$node->start;
+ok(1, 'server started after reset');
+
 done_testing();
diff --git a/src/bin/pg_resetwal/t/002_corrupted.pl 
b/src/bin/pg_resetwal/t/002_corrupted.pl
index 6d19a1efd5..ddbd0556bb 100644
--- a/src/bin/pg_resetwal/t/002_corrupted.pl
+++ b/src/bin/pg_resetwal/t/002_corrupted.pl
@@ -55,4 +55,8 @@
        ],
        'processes zero WAL segment size');
 
+# now try to run it
+command_fails_like([ 'pg_resetwal', $node->data_dir ], qr/not proceeding 
because control file values were guessed/, 'does not run when control file 
values were guessed');
+command_ok([ 'pg_resetwal', '-f', $node->data_dir ], 'runs with force when 
control file values were guessed');
+
 done_testing();

base-commit: 6844d3275ac6b3c35d824f49362d3fe59b30f26b
-- 
2.41.0

Reply via email to