Re: TAP test for recovery_end_command

Michael Paquier Tue, 26 Oct 2021 21:08:18 -0700

On Mon, Oct 25, 2021 at 02:42:28PM +0530, Amul Sul wrote:
> Understood, moved tests to 002_archiving.pl in the attached version.


Thanks for the new patch.  I have reviewed its contents, and there
were a couple of things that caught my attention while putting my
hands on it.

+$node_standby->append_conf('postgresql.conf',
+       "archive_cleanup_command = 'echo archive_cleanuped > 
$archive_cleanup_command_file'");
+$node_standby->append_conf('postgresql.conf',
+       "recovery_end_command = 'echo recovery_ended > 
$recovery_end_command_file'");
This can be formatted with a single append_conf() call and qq() to
have the correct set of quotes.

+$node_primary->safe_psql('postgres', "CHECKPOINT");
 my $current_lsn =
   $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
This had better say that the checkpoint is necessary because we need
one before switching to a new segment on the primary, as much as the
checkpoint on the first standby is needed to trigger the command whose
execution is checked.

+$node_standby2->append_conf('postgresql.conf',
+       "archive_cleanup_command = 'echo xyz > 
$data_dir/unexisting_dir/xyz.file'");
+$node_standby2->append_conf('postgresql.conf',
+       "recovery_end_command = 'echo xyz > 
$data_dir/unexisting_dir/xyz.file'");
[...]
+# Failing to execute archive_cleanup_command and/or recovery_end_command does
+# not affect promotion.
+is($node_standby2->safe_psql( 'postgres', q{SELECT pg_is_in_recovery()}), 'f',
+       "standby promoted successfully despite incorrect 
archive_cleanup_command and recovery_end_command");

This SQL test is mostly useless IMO, as the promote() call done above
ensures that this state is reached properly, and the same thing could
be with the removals of RECOVERYHISTORY and RECOVERYXLOG.  I think
that it would be better to check directly if the commands are run or
not.  This is simple to test: look at the logs from a position just
before the promotion, slurp the log file of $standby2 from this
position, and finally compare its contents with a regex of your
choice.  I have chosen a simple "qr/WARNING:.*recovery_end_command/s"
for the purpose of this test.  Having a test for
archive_cleanup_command here would be nice, but that would be much
more costly than the end-of-recovery command, so I have left that
out.  Perhaps we could just append it in the conf as a dummy, as you
did, though, but its execution is not deterministic in this test so we
are better without for now IMO.

perltidy was also complaining a bit, this is fixed as of the attached.
I have checked things on my own Windows dev box, while on it.
--
Michael

From 01629266d79bca20b3b653ce8c59062ef7187515 Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Wed, 27 Oct 2021 12:59:13 +0900
Subject: [PATCH v6] TAP test for recovery_end_command and
 archive_cleanup_command

---
 src/test/recovery/t/002_archiving.pl | 47 +++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index 24852c97fd..7dc2f4e4c8 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
-use Test::More tests => 3;
+use Test::More tests => 7;
 use File::Copy;
 
 # Initialize primary node, doing archives
@@ -28,6 +28,17 @@ $node_standby->init_from_backup($node_primary, $backup_name,
 	has_restoring => 1);
 $node_standby->append_conf('postgresql.conf',
 	"wal_retrieve_retry_interval = '100ms'");
+
+# Set archive_cleanup_command and recovery_end_command, checking their
+# execution by the backend with dummy commands.
+my $data_dir                     = $node_standby->data_dir;
+my $archive_cleanup_command_file = "$data_dir/archive_cleanup_command.done";
+my $recovery_end_command_file    = "$data_dir/recovery_end_command.done";
+$node_standby->append_conf(
+	'postgresql.conf', qq(
+archive_cleanup_command = 'echo archive_cleanup_done > $archive_cleanup_command_file'
+recovery_end_command = 'echo recovery_ended_done > $recovery_end_command_file'
+));
 $node_standby->start;
 
 # Create some content on primary
@@ -36,6 +47,10 @@ $node_primary->safe_psql('postgres',
 my $current_lsn =
   $node_primary->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
 
+# Note the presence of this checkpoint for the archive_cleanup_command
+# check done below, before switching to a new segment.
+$node_primary->safe_psql('postgres', "CHECKPOINT");
+
 # Force archiving of WAL file to make it present on primary
 $node_primary->safe_psql('postgres', "SELECT pg_switch_wal()");
 
@@ -53,6 +68,14 @@ my $result =
   $node_standby->safe_psql('postgres', "SELECT count(*) FROM tab_int");
 is($result, qq(1000), 'check content from archives');
 
+# archive_cleanup_command is executed after generating a restart point,
+# with a checkpoint.
+$node_standby->safe_psql('postgres', q{CHECKPOINT});
+ok(-f $archive_cleanup_command_file,
+	'archive_cleanup_command executed on checkpoint');
+ok(!-f $recovery_end_command_file,
+	'recovery_end_command executed after promotion');
+
 # Check the presence of temporary files specifically generated during
 # archive recovery.  To ensure the presence of the temporary history
 # file, switch to a timeline large enough to allow a standby to recover
@@ -62,11 +85,26 @@ is($result, qq(1000), 'check content from archives');
 # promoted.
 $node_standby->promote;
 
+# recovery_end_command should have been triggered on promotion.
+ok(-f $recovery_end_command_file,
+	'recovery_end_command executed after promotion');
+
 my $node_standby2 = PostgreSQL::Test::Cluster->new('standby2');
 $node_standby2->init_from_backup($node_primary, $backup_name,
 	has_restoring => 1);
+
+# Check that failures when executing and recovery_end_command at
+# promotion has no effect, but its failure should be logged.
+$node_standby2->append_conf(
+	'postgresql.conf', qq(
+recovery_end_command = 'echo recovery_end_failed > $data_dir/missing_dir/xyz.file'
+));
+
 $node_standby2->start;
 
+# Save the log location, to see the failure of recovery_end_command.
+my $log_location = -s $node_standby2->logfile;
+
 # Now promote standby2, and check that temporary files specifically
 # generated during archive recovery are removed by the end of recovery.
 $node_standby2->promote;
@@ -75,3 +113,10 @@ ok( !-f "$node_standby2_data/pg_wal/RECOVERYHISTORY",
 	"RECOVERYHISTORY removed after promotion");
 ok( !-f "$node_standby2_data/pg_wal/RECOVERYXLOG",
 	"RECOVERYXLOG removed after promotion");
+
+# Check the logs of the standby to see that the commands have failed.
+my $log_contents = slurp_file($node_standby2->logfile, $log_location);
+like(
+	$log_contents,
+	qr/WARNING:.*recovery_end_command/s,
+	"recovery_end_command failure detected in logs after promotion");
-- 
2.33.0

signature.asc
Description: PGP signature

Re: TAP test for recovery_end_command

Reply via email to