On Wed, Jun 11, 2025 at 1:44 AM Alexander Korotkov <aekorot...@gmail.com> wrote:
>
> On Mon, Jun 9, 2025 at 7:09 PM Vitaly Davydov <v.davy...@postgrespro.ru>
> wrote:
> > > I think we can use this approach for HEAD and probably keep the
> > > previous idea for backbranches. Keeping some value in shared memory
> > > per slot sounds risky to me in terms of introducing new bugs.
> >
> > I'm not sure what kind of problems may occur. I propose to allocate in
> > shmem an array of last_saved_restart_lsn like below, which is not part
> > of the public API. It will be allocated and deallocated in shmem the
> > same way as ReplicationSlotCtlData. I can prepare a patch, if needed.
> >
> > typedef struct ReplicationSlotCtlDataExt {
> >     XLogRecPtr last_saved_restart_lsn[1];
> > } ReplicationSlotCtlDataExt;
>
> This could work, but I think it is not a solution for HEAD anyway.
> In HEAD, it would be better to keep everything inside the
> ReplicationSlot struct. At the same time, I don't like the idea of
> having different shared memory structs between branches if we can
> avoid that.
>
> > > Yeah, but with physical slots, it is possible that the slot's xmin
> > > value is pointing to some value, say 700 (after restart), but vacuum
> > > would have removed tuples from transaction IDs greater than 700, as
> > > explained in email [1].
> >
> > I think we have no xmin problem for physical slots. The xmin values of
> > physical slots are used to process hot standby feedback (HSF) messages.
> > If I understood you correctly, you are talking about the problem that
> > hot standby feedback solves. These messages are used to prevent
> > vacuuming of tuples on the primary in order to avoid conflicts with
> > queries on the replica (some queries there may still need tuples that
> > were vacuumed on the primary, with the removals then replicated to the
> > standby). If the primary receives an HSF message after the slot was
> > saved, I believe it is acceptable for autovacuum to clean tuples with
> > xmin later than the last saved value. If the primary restarts, the
> > older value will be loaded, but the replica has already confirmed the
> > newer value. As for the replica, it is the replica's obligation to
> > send an HSF xmin that will survive its own immediate restart.
>
> +1
>
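As a side note on the shared-memory placement discussed in the quote above:
keeping the value inside the ReplicationSlot struct, as Alexander suggests,
could look roughly like the sketch below. This is only an illustration of
the idea, not the actual slot.h definition or a committed patch; the field
placement and comment are assumptions.

/*
 * Sketch only: unrelated fields are elided, and the field below is an
 * assumption based on the discussion above, not committed code.
 */
typedef struct ReplicationSlot
{
    /* ... existing fields (mutex, in_use, data.restart_lsn, ...) ... */

    /*
     * restart_lsn as it was when this slot was last flushed to disk.  WAL
     * removal must not go past the minimum of this value across all slots,
     * because after a crash the slot is reloaded with the on-disk value.
     */
    XLogRecPtr  last_saved_restart_lsn;
} ReplicationSlot;

Compared to a separate ReplicationSlotCtlDataExt array, this keeps the saved
value next to the live restart_lsn it shadows.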
The point is about the general principle of a slot's xmin values, which is
that rows with an xid greater than the slot's xmin should remain available
(i.e., they can't be removed by vacuum). But here, such a principle could be
violated after a restart. I don't have a test to show what harm it can
cause, but I will try to think/investigate more on it.

> > >> Taking into account these thoughts, I can't see any problems with
> > >> the alternative patch where the oldest WAL LSN is calculated only
> > >> in checkpoint.
> > >>
> > > The alternative will needlessly prevent removing WAL segments in some
> > > cases when logical slots are in use.
> >
> > IMHO, I'm not sure it will significantly impact WAL removal. We remove
> > WAL segments only at checkpoint. The alternative solution gets the
> > oldest WAL segment at the beginning of the checkpoint, then saves dirty
> > slots to disk, and removes old WAL segments at the end of the
> > checkpoint using the oldest WAL segment obtained at the beginning. The
> > alternative solution may not be as effective in terms of WAL segment
> > removal if a logical slot is advanced during the checkpoint, but I do
> > not think that is a significant issue. On the other hand, the
> > alternative solution simplifies the logic of WAL removal, is backward
> > compatible (it avoids adding new in-memory state), and decreases the
> > number of locks taken in ReplicationSlotsComputeRequiredLSN: there is
> > no need to recalculate the oldest slots' restart_lsn every time a slot
> > is advanced.
>
> So, my proposal is to commit the attached patchset to HEAD, and
> commit [1] to the back branches.  Any objections?
>

No objections. I think we can keep discussing whether the slot's xmin
computation has any issues or not, but you can proceed with the LSN stuff.
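To make the ordering in the quoted paragraph concrete, here is a minimal,
self-contained sketch of the alternative checkpoint flow. The helper names
(SnapshotOldestSlotLSN, SaveDirtySlotsToDisk, RemoveWALOlderThan) are
hypothetical stand-ins, not the attached patch or existing PostgreSQL
routines; only the capture-then-flush-then-remove ordering is the point.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;    /* stand-in for PostgreSQL's XLogRecPtr */

/* Pretend in-memory slot state; a real server has an array of slots. */
static XLogRecPtr in_memory_restart_lsn = 0x3000000;

/* Hypothetical stand-ins for the real slot/WAL routines. */
static XLogRecPtr
SnapshotOldestSlotLSN(void)
{
    /* minimum restart_lsn over all in-memory slots */
    return in_memory_restart_lsn;
}

static void
SaveDirtySlotsToDisk(void)
{
    /* flush dirty slots; the on-disk restart_lsn becomes >= the snapshot */
    printf("slots flushed at restart_lsn %llX\n",
           (unsigned long long) in_memory_restart_lsn);
}

static void
RemoveWALOlderThan(XLogRecPtr cutoff)
{
    printf("removing WAL segments needed only before %llX\n",
           (unsigned long long) cutoff);
}

int
main(void)
{
    /*
     * 1. Capture the oldest LSN any slot still needs *before* the slots
     *    are flushed.  A slot that advances during the checkpoint merely
     *    keeps some WAL one checkpoint longer, which is the safe direction.
     */
    XLogRecPtr cutoff = SnapshotOldestSlotLSN();

    /* 2. Flush dirty slot state; the on-disk restart_lsn is now >= cutoff. */
    SaveDirtySlotsToDisk();

    /*
     * 3. Only now remove WAL, using the value captured in step 1, so
     *    nothing the on-disk slot state still points at can be missing
     *    after a restart.
     */
    RemoveWALOlderThan(cutoff);
    return 0;
}

The trade-off is the one described above: a slot advanced between steps 1
and 3 keeps its WAL until the next checkpoint, in exchange for not having to
track extra per-slot state on every advance.

--
With Regards,
Amit Kapila.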