On Thu, Feb 27, 2025 at 7:26 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Feb 28, 2025 at 5:10 AM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Thu, Feb 27, 2025 at 12:52 AM Amit Kapila <amit.kapil...@gmail.com> > > wrote: > > > > > > On Thu, Feb 27, 2025 at 10:47 AM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > > > > > On Tue, Feb 25, 2025 at 7:33 PM Amit Kapila <amit.kapil...@gmail.com> > > > > wrote: > > > > > > > > > > AFAICU, InvalidateObsoleteReplicationSlots() is not serialized with > > > > > this operation. So, isn't it possible that the source slot exists at > > > > > the later position in ReplicationSlotCtl->replication_slots and the > > > > > loop traversing slots is already ahead from the point where the newly > > > > > copied slot is created? > > > > > > > > Good point. I think it's possible. > > > > > > > > > If so, the newly created slot won't be marked > > > > > as invalid whereas the source slot will be marked as invalid. I agree > > > > > that even in such a case, at a later point, the newly created slot > > > > > will also be marked as invalid. > > > > > > > > The wal_status of the newly created slot would immediately become > > > > 'lost' and the next checkpoint will invalidate it. Do we need to do > > > > something to deal with this case? > > > > > > > > > > + /* Check if source slot became invalidated during the copy operation */ > > > + if (second_slot_contents.data.invalidated != RS_INVAL_NONE) > > > + ereport(ERROR, > > > + errmsg("cannot copy replication slot \"%s\"", > > > + NameStr(*src_name)), > > > + errdetail("The source replication slot was invalidated during the > > > copy operation.")); > > > > > > How about adding a similar test as above for MyReplicationSlot? That > > > should suffice the need because we have already acquired the new slot > > > by this time and invalidation should signal this process before > > > marking the new slot as invalid. > > > > IIUC in the scenario you mentioned, the loop traversing slots already > > passed the position of newly created slot in > > ReplicationSlotCtl->replication_slots array, so > > MyReplicationSlot->data.invalidated is still RS_INVAL_NONE, no? > > > > Right, I don't have a simpler solution for this apart from making it > somehow serialize this operation with > InvalidateObsoleteReplicationSlots(). I don't think we need to go that > far to handle the scenario being discussed.
Agreed. > It is better to add a > comment for this race and why it won't harm. +1 Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com