On Tue, Jun 09, 2020 at 09:01:13PM +0300, Alexey Kondratov wrote:
> Yes, there was a ReplicationSlotsComputeRequiredLSN() call inside
> pg_physical_replication_slot_advance() in the v5 of the patch:
>
> But later it has been missed and we have not noticed that.
>
> I think that adding it back as per attached will be enough.

[ scratches head... ]
Indeed, this part gets wrong and we would have to likely rely on a WAL
sender to do this calculation once a new flush location is received,
but that may not happen in some cases.  It feels more natural to do
that in the location where the slot is marked as dirty, and there 
is no need to move around an extra check to see if the slot has
actually been advanced or not.  Or we could just call the routine once
any advancing is attempted?  That would be unnecessary if no advancing
is done.

> > I find it really depressing how much obviously untested stuff gets
> > added in this area.
>
> Prior to this patch pg_replication_slot_advance was not being tested
> at all.
> Unfortunately, added tests appeared to be not enough to cover all
> cases. It
> seems that the whole machinery of WAL holding and trimming is worth
> to be
> tested more thoroughly.

I think that it would be interesting if we had a SQL representation of
the contents of XLogCtlData (wanted that a couple of times).  Now we
are actually limited to use a checkpoint and check that past segments
are getting recycled by looking at the contents of pg_wal.  Doing that
here does not cause the existing tests to be much more expensive as we
only need one extra call to pg_switch_wal(), mostly.  Please see the
attached.
--
Michael
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 1b929a603e..0ad9ebb794 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -419,6 +419,12 @@ pg_physical_replication_slot_advance(XLogRecPtr moveto)
 		 * crash, but this makes the data consistent after a clean shutdown.
 		 */
 		ReplicationSlotMarkDirty();
+
+		/*
+		 * Recompute the minimum LSN across all slots to adjust with the
+		 * advancing done.
+		 */
+		ReplicationSlotsComputeRequiredLSN();
 	}
 
 	return retlsn;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0c316c1808..778f11b28b 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 35;
+use Test::More tests => 36;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -364,15 +364,26 @@ my $is_replayed = $node_standby_2->safe_psql('postgres',
 	qq[SELECT 1 FROM replayed WHERE val = $newval]);
 is($is_replayed, qq(1), "standby_2 didn't replay master value $newval");
 
+# Drop any existing slots on the primary, for the follow-up tests.
+$node_master->safe_psql('postgres',
+	"SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;");
+
 # Test physical slot advancing and its durability.  Create a new slot on
 # the primary, not used by any of the standbys. This reserves WAL at creation.
 my $phys_slot = 'phys_slot';
 $node_master->safe_psql('postgres',
 	"SELECT pg_create_physical_replication_slot('$phys_slot', true);");
+# Generate some WAL, and switch to a new segment, used to check that
+# the previous segment is correctly getting recycled as the slot advancing
+# would recompute the minimum LSN calculated across all slots.
+my $segment_removed = $node_master->safe_psql('postgres',
+	'SELECT pg_walfile_name(pg_current_wal_lsn())');
+chomp($segment_removed);
 $node_master->psql(
 	'postgres', "
 	CREATE TABLE tab_phys_slot (a int);
-	INSERT INTO tab_phys_slot VALUES (generate_series(1,10));");
+	INSERT INTO tab_phys_slot VALUES (generate_series(1,10));
+	SELECT pg_switch_wal();");
 my $current_lsn =
   $node_master->safe_psql('postgres', "SELECT pg_current_wal_lsn();");
 chomp($current_lsn);
@@ -392,3 +403,9 @@ my $phys_restart_lsn_post = $node_master->safe_psql('postgres',
 chomp($phys_restart_lsn_post);
 ok( ($phys_restart_lsn_pre cmp $phys_restart_lsn_post) == 0,
 	"physical slot advance persists across restarts");
+
+# Check if the previous segment gets correctly recycled after the
+# server stopped cleanly, causing a shutdown checkpoint to be generated.
+my $master_data = $node_master->data_dir;
+ok(!-f "$master_data/pg_wal/$segment_removed",
+	"WAL segment $segment_removed recycled after physical slot advancing");

Attachment: signature.asc
Description: PGP signature

Reply via email to