Re: Introduce XID age and inactive timeout based replication slot invalidation

Peter Smith Mon, 09 Sep 2024 00:41:40 -0700

Hi, here are some review comments for v45-0001 (excluding the test code)

======
doc/src/sgml/config.sgml


1.
+        Note that the inactive timeout invalidation mechanism is not
+        applicable for slots on the standby server that are being synced
+        from primary server (i.e., standby slots having

nit - /from primary server/from the primary server/

======
src/backend/replication/slot.c

2. ReplicationSlotAcquire

+ errmsg("can no longer get changes from replication slot \"%s\"",
+ NameStr(s->data.name)),
+ errdetail("This slot has been invalidated because it was inactive
for longer than the amount of time specified by \"%s\".",
+    "replication_slot_inactive_timeout.")));

nit - "replication_slot_inactive_timeout." - should be no period
inside that GUC name literal

~~~

3. ReportSlotInvalidation

I didn't understand why there was a hint for:
"You might need to increase \"%s\".", "max_slot_wal_keep_size"

But you don't have an equivalent hint for timeout invalidation:
"You might need to increase \"%s\".", "replication_slot_inactive_timeout"

Why aren't these similar cases consistent?

~~~

4. RestoreSlotFromDisk

+ /* Use the same inactive_since time for all the slots. */
+ if (now == 0)
+ now = GetCurrentTimestamp();
+

Is the deferred assignment really necessary? Why not just
unconditionally assign the 'now' just before the for-loop? Or even at
the declaration? e.g. The 'replication_slot_inactive_timeout' is
measured in seconds so I don't think 'inactive_since' being wrong by a
millisecond here will make any difference.

======
src/include/replication/slot.h

5. ReplicationSlotSetInactiveSince

+/*
+ * Set slot's inactive_since property unless it was previously invalidated due
+ * to inactive timeout.
+ */
+static inline void
+ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now,
+ bool acquire_lock)
+{
+ if (acquire_lock)
+ SpinLockAcquire(&s->mutex);
+
+ if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT)
+ s->inactive_since = *now;
+
+ if (acquire_lock)
+ SpinLockRelease(&s->mutex);
+}

Is the logic correct? What if the slot was already invalid due to some
reason other than RS_INVAL_INACTIVE_TIMEOUT? Is an Assert needed?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 27b2285..97b4fb5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4582,7 +4582,7 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" 
"%p"'  # Windows
        <para>
         Note that the inactive timeout invalidation mechanism is not
         applicable for slots on the standby server that are being synced
-        from primary server (i.e., standby slots having
+        from the primary server (i.e., standby slots having
         <link 
linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
         value <literal>true</literal>).
         Synced slots are always considered to be inactive because they don't
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index d92b92b..8cc67b4 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -640,7 +640,7 @@ retry:
                                 errmsg("can no longer get changes from 
replication slot \"%s\"",
                                                NameStr(s->data.name)),
                                 errdetail("This slot has been invalidated 
because it was inactive for longer than the amount of time specified by 
\"%s\".",
-                                                  
"replication_slot_inactive_timeout.")));
+                                                  
"replication_slot_inactive_timeout")));
        }
 
        /*

Re: Introduce XID age and inactive timeout based replication slot invalidation

Reply via email to