Hello Hackers,
(CC people involved in the earlier discussion)

While implementing slot invalidation based on inactive (idle) timeout (see [1]), several general optimizations and improvements were identified. This thread is a spin-off from [1], intended to address these optimizations separately from the main feature. As suggested in [2], the improvements are divided into two parts:

Patch-001: Update the logic to ensure all inactive slots have the same 'inactive_since' time when restoring the slots from disk in RestoreSlotFromDisk() and when updating the synced slots on standby in update_synced_slots_inactive_since().

Patch-002: Raise an error for invalid slots while acquiring them in ReplicationSlotAcquire(). Currently, a process can acquire an invalid slot but may eventually error out at a later stage. For example, if a process acquires a slot invalidated due to wal_removed, it will later fail in CreateDecodingContext() when trying to access the removed WAL. The idea here is to improve error handling by detecting invalid slots earlier. A new parameter, "error_if_invalid", is introduced in ReplicationSlotAcquire(). If the caller specifies error_if_invalid=true, an error is raised immediately instead of letting the process acquire the invalid slot first and then fail later because the slot is invalidated.

The v1 patches are attached; any feedback would be appreciated!

[1] https://www.postgresql.org/message-id/flat/calj2acw4aue-_ufqojdwcen-xxolghmvrfnl8snw_tz5nje...@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1LiDjz%2BF8hEYG0_ux%3DrqwhxnTuWpT-RKNDWaac3w3bWNw%40mail.gmail.com

--
Thanks,
Nisha
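
For readers skimming the thread, the behaviour described for Patch-002 can be sketched roughly as below. This is a simplified standalone model and not the code in the attached patch: DemoSlot, DemoSlotState and demo_slot_acquire() are hypothetical names invented for illustration, the message text is made up, and the real ReplicationSlotAcquire() would presumably report the problem through ereport(ERROR) rather than return a bool.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a slot's invalidation state (not PostgreSQL's enum). */
typedef enum
{
    DEMO_SLOT_VALID = 0,
    DEMO_SLOT_WAL_REMOVED       /* models invalidation due to wal_removed */
} DemoSlotState;

/* Hypothetical, heavily reduced model of a replication slot. */
typedef struct
{
    const char   *name;
    DemoSlotState state;
} DemoSlot;

/*
 * Model of the acquire step. Returns true if the slot was acquired.
 * With error_if_invalid=true an invalidated slot is rejected right away;
 * with error_if_invalid=false the caller still gets the slot and would only
 * fail later, e.g. when trying to read WAL that has been removed.
 */
static bool
demo_slot_acquire(const DemoSlot *slot, bool error_if_invalid)
{
    if (error_if_invalid && slot->state != DEMO_SLOT_VALID)
    {
        /* Assumption: real code would raise an error here instead. */
        fprintf(stderr, "slot \"%s\" is invalidated, refusing to acquire\n",
                slot->name);
        return false;
    }

    return true;
}
```

Callers that need to operate on an invalidated slot would pass error_if_invalid=false in this model and keep the existing behaviour.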
v1-0001-Ensure-same-inactive_since-time-for-all-inactive-.patch
Description: Binary data
v1-0002-Raise-Error-for-Invalid-Slots-in-ReplicationSlotA.patch
Description: Binary data