Hi, here are some review comments for v45-0001 (excluding the test code) ====== doc/src/sgml/config.sgml
1. + Note that the inactive timeout invalidation mechanism is not + applicable for slots on the standby server that are being synced + from primary server (i.e., standby slots having nit - /from primary server/from the primary server/ ====== src/backend/replication/slot.c 2. ReplicationSlotAcquire + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(s->data.name)), + errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", + "replication_slot_inactive_timeout."))); nit - "replication_slot_inactive_timeout." - should be no period inside that GUC name literal ~~~ 3. ReportSlotInvalidation I didn't understand why there was a hint for: "You might need to increase \"%s\".", "max_slot_wal_keep_size" But you don't have an equivalent hint for timeout invalidation: "You might need to increase \"%s\".", "replication_slot_inactive_timeout" Why aren't these similar cases consistent? ~~~ 4. RestoreSlotFromDisk + /* Use the same inactive_since time for all the slots. */ + if (now == 0) + now = GetCurrentTimestamp(); + Is the deferred assignment really necessary? Why not just unconditionally assign the 'now' just before the for-loop? Or even at the declaration? e.g. The 'replication_slot_inactive_timeout' is measured in seconds so I don't think 'inactive_since' being wrong by a millisecond here will make any difference. ====== src/include/replication/slot.h 5. ReplicationSlotSetInactiveSince +/* + * Set slot's inactive_since property unless it was previously invalidated due + * to inactive timeout. + */ +static inline void +ReplicationSlotSetInactiveSince(ReplicationSlot *s, TimestampTz *now, + bool acquire_lock) +{ + if (acquire_lock) + SpinLockAcquire(&s->mutex); + + if (s->data.invalidated != RS_INVAL_INACTIVE_TIMEOUT) + s->inactive_since = *now; + + if (acquire_lock) + SpinLockRelease(&s->mutex); +} Is the logic correct? What if the slot was already invalid due to some reason other than RS_INVAL_INACTIVE_TIMEOUT? Is an Assert needed? ====== Kind Regards, Peter Smith. Fujitsu Australia
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 27b2285..97b4fb5 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -4582,7 +4582,7 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows <para> Note that the inactive timeout invalidation mechanism is not applicable for slots on the standby server that are being synced - from primary server (i.e., standby slots having + from the primary server (i.e., standby slots having <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield> value <literal>true</literal>). Synced slots are always considered to be inactive because they don't diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index d92b92b..8cc67b4 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -640,7 +640,7 @@ retry: errmsg("can no longer get changes from replication slot \"%s\"", NameStr(s->data.name)), errdetail("This slot has been invalidated because it was inactive for longer than the amount of time specified by \"%s\".", - "replication_slot_inactive_timeout."))); + "replication_slot_inactive_timeout"))); } /*