On Mon, 5 Jan 2026 at 15:01, Amit Kapila <[email protected]> wrote:
>
> On Tue, Dec 30, 2025 at 5:31 AM Masahiko Sawada <[email protected]> wrote:
> >
> > On Sun, Dec 14, 2025 at 8:21 PM Amit Kapila <[email protected]> wrote:
> > >
> > > On Thu, Dec 11, 2025 at 12:39 PM Zhijie Hou (Fujitsu)
> > > <[email protected]> wrote:
> > > >
> > > > >
> > > > > The other idea to fix this problem is suggested by Alexander in his
> > > > > email [1] which is to introduce a new ReplicationSlotReserveWALLock
> > > > > for this purpose. I think introducing an LWLock in back branches
> > > > > could be questionable. Did you evaluate the pros and cons of using
> > > > > that approach?
> > > >
> > > > I reviewed that approach, and I think the main distinction lies in
> > > > whether to use a new LWLock to serialize the process or rely on an
> > > > existing lock. Introducing a new LWLock in back branches would alter
> > > > the size of MainLWLockArray and affect
> > > > NUM_INDIVIDUAL_LWLOCKS/LWTRANCHE_FIRST_USER_DEFINED. Although this may
> > > > not directly impact user applications, since users typically use
> > > > standard APIs like RequestNamedLWLockTranche and LWLockNewTrancheId to
> > > > add private LWLocks, it still has a slight risk. Additionally, using
> > > > an existing lock could keep code similarity with HEAD, which can be
> > > > helpful for future bug fixes and analysis.
> > > >
> > >
> > > Fair enough. I'll wait for Sawada-san/Vitaly to see what their opinion
> > > on this matter is.
> >
> > While it's hacky that the proposed approach takes
> > ReplicationSlotAllocationLock before XLogGetReplicationSlotMinimumLSN()
> > during checkpoint, I find that it's better than introducing a new LWLock
> > in minor versions, which could lead to unexpected compatibility issues.
> >
> > Regarding the v10 patch, do we need to take
> > ReplicationSlotAllocationLock also at the following place?
> >
> >     /*
> >      * Recalculate the current minimum LSN to be used in the WAL segment
> >      * cleanup. Then, we must synchronize the replication slots again in
> >      * order to make this LSN safe to use.
> >      */
> >     slotsMinReqLSN = XLogGetReplicationSlotMinimumLSN();
> >     CheckPointReplicationSlots(shutdown);
> >
>
> No, we don't need the allocation lock here because RedoRecPtr won't
> change after the previous computation, so the WAL reservation during
> creation of slots won't be impacted. I mean, if the slot reservation
> starts just before this computation, it should use the latest (this
> checkpoint cycle's) RedoRecPtr, whereas that was not true in the case
> where the patch acquires it.
>
> > I think we need to add some comments regardless of taking the lwlock.
> >
>
> I have added a comment, though I'm not sure if it is required.

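To make the change being discussed easier to follow when skimming the
thread, the shape of the serialization is roughly the following. This is
only an illustrative sketch (the lock modes, the choice of starting LSN,
and the exact placement are simplified assumptions), not the actual patch
hunks:

    /*
     * ReplicationSlotReserveWal(): choose and publish restart_lsn while
     * holding the allocation lock, so that the checkpointer cannot compute
     * the WAL removal horizon in between.  (Sketch only.)
     */
    XLogRecPtr  restart_lsn;

    LWLockAcquire(ReplicationSlotAllocationLock, LW_SHARED);
    restart_lsn = GetRedoRecPtr();   /* simplified; depends on slot type */
    SpinLockAcquire(&slot->mutex);
    slot->data.restart_lsn = restart_lsn;
    SpinLockRelease(&slot->mutex);
    ReplicationSlotsComputeRequiredLSN();
    LWLockRelease(ReplicationSlotAllocationLock);

    /*
     * Checkpoint side: read the slots' minimum required LSN under the same
     * lock before deciding which old WAL segments can be removed.
     */
    LWLockAcquire(ReplicationSlotAllocationLock, LW_EXCLUSIVE);
    slotsMinReqLSN = XLogGetReplicationSlotMinimumLSN();
    LWLockRelease(ReplicationSlotAllocationLock);
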
Here is a version which includes the back-branch patches with pgindent
changes. I have verified the following scenario on the PG17, PG16, PG15
and PG14 branches, and it works as expected with the patches:
1. Start a backend that creates a slot (s), but block it before it updates
slot.restart_lsn in ReplicationSlotReserveWal():
select pg_create_logical_replication_slot('s', 'test_decoding');
2. Start another backend and generate some new WAL files:
select pg_switch_wal();
select pg_switch_wal();
select pg_switch_wal();
3. Execute a checkpoint, but block it before it calls
InvalidateObsoleteReplicationSlots() (see the sketch after these steps).
4. Release the backend blocked in step 1 so that it finishes creating slot (s).
5. Release the checkpoint. The slot will be invalidated.
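
In case it helps when trying the steps above, here is roughly how the two
blocking points (steps 1 and 3) relate to the checkpoint-side code. This is
a paraphrase, not the exact source, and the argument lists (which differ
across branches) are elided:

    /* Tail of CreateCheckPoint(), paraphrased: */
    KeepLogSeg(recptr, &_logSegNo);          /* picks the WAL removal horizon,
                                              * consulting the slots' minimum
                                              * required LSN */
    InvalidateObsoleteReplicationSlots(...); /* step 3 blocks just before
                                              * this call */
    RemoveOldXlogFiles(...);

Step 1 blocks the new slot inside ReplicationSlotReserveWal() before it
publishes its restart_lsn, so the horizon above can be chosen without seeing
the new slot; once the slot finally sets a restart_lsn behind that horizon,
the invalidation pass treats the freshly created slot as obsolete.
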
The v12_PG15 patch can be used for the PG14 branch too.
Regards,
Vignesh
v12_PG17-0001-Prevent-invalidation-of-newly-created-repli.patch
v12_PG15-0001-Prevent-invalidation-of-newly-created-repli.patch
v12_PG16-0001-Prevent-invalidation-of-newly-created-repli.patch
