On Sun, Sep 8, 2024 at 5:25 PM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote: > > > Please find the v45 patch. Addressed above and Shveta's review comments [1]. >
Thanks for the patch. Please find my comments: 1) src/sgml/config.sgml: + Synced slots are always considered to be inactive because they don't perform logical decoding to produce changes. It is better we avoid such a statement, as internally we use logical decoding to advance restart-lsn, see 'LogicalSlotAdvanceAndCheckSnapState' called form slotsync.c. <Also see related comment 6 below> 2) src/sgml/config.sgml: + disables the inactive timeout invalidation mechanism + Slot invalidation due to inactivity timeout occurs during checkpoint. Either have 'inactive' at both the places or 'inactivity'. 3) slot.c: +static bool InvalidatePossiblyObsoleteSlot(ReplicationSlotInvalidationCause cause, + ReplicationSlot *s, + XLogRecPtr oldestLSN, + Oid dboid, + TransactionId snapshotConflictHorizon, + bool *invalidated); +static inline bool SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s); I think, we do not need above 2 declarations. The code compile fine without these as the usage is later than the definition. 4) + /* + * An error is raised if error_if_invalid is true and the slot has been + * invalidated previously. + */ + if (error_if_invalid && s->data.invalidated == RS_INVAL_INACTIVE_TIMEOUT) The comment is generic while the 'if condition' is specific to one invalidation cause. Even though I feel it can be made generic test for all invalidation causes but that is not under scope of this thread and needs more testing/analysis. For the time being, we can make comment specific to the concerned invalidation cause. The header of function will also need the same change. 5) SlotInactiveTimeoutCheckAllowed(): + * Check if inactive timeout invalidation mechanism is disabled or slot is + * currently being used or server is in recovery mode or slot on standby is + * currently being synced from the primary. + * These comments say exact opposite of what we are checking in code. Since the function name has 'Allowed' in it, we should be putting comments which say what allows it instead of what disallows it. 6) + * Synced slots are always considered to be inactive because they don't + * perform logical decoding to produce changes. + */ +static inline bool +SlotInactiveTimeoutCheckAllowed(ReplicationSlot *s) Perhaps we should avoid mentioning logical decoding here. When slots are synced, they are performing decoding and their inactive_since is changing continuously. A better way to make this statement will be: We want to ensure that the slots being synchronized are not invalidated, as they need to be preserved for future use when the standby server is promoted to the primary. This is necessary for resuming logical replication from the new primary server. <Rephrase if needed> 7) InvalidatePossiblyObsoleteSlot() we are calling SlotInactiveTimeoutCheckAllowed() twice in this function. We shall optimize. At the first usage place, shall we simply get timestamp when cause is RS_INVAL_INACTIVE_TIMEOUT without checking SlotInactiveTimeoutCheckAllowed() as IMO it does not seem a performance critical section. Or if we retain check at first place, then at the second place we can avoid calling it again based on whether 'now' is NULL or not. thanks Shveta