On Fri, Nov 21, 2025 at 12:14 AM Zhijie Hou (Fujitsu)
<[email protected]> wrote:
>
> On Thursday, November 20, 2025 6:44 PM Amit Kapila <[email protected]> 
> wrote:
> >
> > On Thu, Nov 20, 2025 at 4:07 PM Zhijie Hou (Fujitsu) 
> > <[email protected]>
> > wrote:
> > >
> > > On Thursday, November 20, 2025 4:26 PM Vitaly Davydov
> > <[email protected]> wrote:
> > >
> > > >
> > > > Concerning reserve_wal_for_local_slot. It seems it is used in
> > > > synchronization of failover logical slots. For me, it is tricky to
> > > > change restart_lsn of a synced logical slot to RedoRecPtr, because
> > > > it may lead to problems with logical replication using such slot
> > > > after the replica promotion. But it seems it is the architectural
> > > > problem and it is not related to the problems, solved by the patch.
> > >
> > > I think this is not an issue because if we use the redo pointer
> > > instead of the remote restart_lsn as the initial value, the synced
> > > slot won't be marked as sync-ready, so user cannot use it after
> > > promotion (also well documented). This is also the existing behavior
> > > before the patch, e.g., if the required WALs were removed, the oldest
> > > available WAL was used as the initial value, similarly resulting in the 
> > > slot not
> > being sync-ready.
> > >
> >
> > Would it be better to discuss this in a separate thread? Though this is 
> > related
> > to original problem but still in a separate part of code
> > (slotsync) which I think can have a separate fix especially when the fix is 
> > also
> > somewhat different.
> >
> > > >
> > > > The change of lock mode to EXCLUSIVE in
> > > > ReplicationSlotsComputeRequiredLSN may affect the performance when
> > a
> > > > lot of slots are advanced during some small period of time. It may
> > > > affect the walsender performance. It advances the logical or
> > > > physical slots when receive a confirmation from the replica. I
> > > > guess, the slot advancement may be pretty frequent operation.
> > >
> > > Yes, I had the same thought and considered a simple alternative
> > > (similar to your suggestion below): use an exclusive lock only when
> > > updating the slot.restart_lsn during WAL reservation, while continuing
> > > to use a shared lock in the computation function. Additionally, place
> > XLogSetReplicationSlotMinimumLSN() under the lock.
> > > This approach will also help serialize the process.
> > >
> >
> > Can we discuss this as well in a separate thread?
>
> OK, I think it makes sense to start separate threads.
>
> I have split the patches based on the different bugs they
> address and am sharing them here for reference.
>

I'm reviewing the 0001 patch and the problem that can be addressed by
that patch. While the proposed patch addresses the race condition
between a checkpointing and newly created slot, could the same issue
happen between the checkpointing and copying a slot? I'm trying to
understand when we have to acquire ReplicationSlotAllocationLock in an
exclusive mode in the new lock scheme.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


Reply via email to