On Mon, Jun 24, 2024 at 4:01 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> Hi,
>
> On Mon, Jun 17, 2024 at 5:55 PM Bharath Rupireddy
> <bharath.rupireddyforpostg...@gmail.com> wrote:
> >
> > Here are my thoughts on when to do the XID age invalidation. In all
> > the patches sent so far, the XID age invalidation happens in two
> > places - one during the slot acquisition, and another during the
> > checkpoint. As the suggestion is to do it during the vacuum (manual
> > and auto), so that even if the checkpoint isn't happening in the
> > database for whatever reasons, a vacuum command or autovacuum can
> > invalidate the slots whose XID is aged.
> >
> > An idea is to check for XID age based invalidation for all the slots
> > in ComputeXidHorizons() before it reads replication_slot_xmin and
> > replication_slot_catalog_xmin, and obviously before the proc array
> > lock is acquired. A potential problem with this approach is that the
> > invalidation check can become too aggressive as XID horizons are
> > computed from many places.
> >
> > Another idea is to check for XID age based invalidation for all the
> > slots in higher levels than ComputeXidHorizons(), for example in
> > vacuum() which is an entry point for both vacuum command and
> > autovacuum. This approach seems similar to vacuum_failsafe_age GUC
> > which checks each relation for the failsafe age before vacuum gets
> > triggered on it.
>
> I am attaching the patches implementing the idea of invalidating
> replication slots during vacuum when current slot xmin limits
> (procArray->replication_slot_xmin and
> procArray->replication_slot_catalog_xmin) are aged as per the new XID
> age GUC. When either of these limits are aged, there must be at least
> one replication slot that is aged, because the xmin limits, after all,
> are the minimum of xmin or catalog_xmin of all replication slots. In
> this approach, the new XID age GUC will help vacuum when needed,
> because the current slot xmin limits are recalculated after
> invalidating replication slots that are holding xmins for longer than
> the age. The code is placed in vacuum() which is common for both
> vacuum command and autovacuum, and gets executed only once every
> vacuum cycle to not be too aggressive in invalidating.
>
> However, there might be some concerns with this approach like the
> following:
> 1) Adding more code to vacuum might not be acceptable
> 2) What if invalidation of replication slots emits an error, will it
> block vacuum forever? Currently, InvalidateObsoleteReplicationSlots()
> is also called as part of the checkpoint, and emitting ERRORs from
> within is avoided already. Therefore, there is no concern here for
> now.
> 3) What if there are more replication slots to be invalidated, will it
> delay the vacuum? If yes, by how much? <<TODO>>
> 4) Will the invalidation based on just current replication slot xmin
> limits suffice irrespective of vacuum cutoffs? IOW, if the replication
> slots are invalidated but vacuum isn't going to do any work because
> vacuum cutoffs are not yet met? Is the invalidation work wasteful
> here?
> 5) Is it okay to take just one more time the proc array lock to get
> current replication slot xmin limits via
> ProcArrayGetReplicationSlotXmin() once every vacuum cycle? <<TODO>>
> 6) Vacuum command can't be run on the standby in recovery. So, to help
> invalidate replication slots on the standby, I have for now let the
> checkpointer also do the XID age based invalidation. I know
> invalidating both in checkpointer and vacuum may not be a great idea,
> but I'm open to thoughts.
>
> Following are some of the alternative approaches which IMHO don't help
> vacuum when needed:
> a) Let the checkpointer do the XID age based invalidation, and call it
> out in the documentation that if the checkpoint doesn't happen, the
> new GUC doesn't help even if the vacuum is run. This has been the
> approach until v40 patch.
> b) Checkpointer and/or other backends add an autovacuum work item via
> AutoVacuumRequestWork(), and autovacuum when it gets to it will
> invalidate the replication slots. But, what to do for the vacuum
> command here?
>
> Please find the attached v41 patches implementing the idea of vacuum
> doing the invalidation.
>
> Thoughts?
>
> Thanks to Sawada-san for a detailed off-list discussion.
>

The patch no longer applies on HEAD, please rebase.

regards,
Ajin Cherian
Fujitsu Australia

Reply via email to