Hi, On 2021-03-23 23:37:14 +0900, Masahiko Sawada wrote: > On Tue, Mar 23, 2021 at 3:09 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Mon, Mar 22, 2021 at 12:20 PM Masahiko Sawada <sawada.m...@gmail.com> > > wrote: > > > > > > On Mon, Mar 22, 2021 at 1:25 PM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > > > > > On Sat, Mar 20, 2021 at 3:52 AM Andres Freund <and...@anarazel.de> > > > > wrote: > > > > > > > > > > - If max_replication_slots was lowered between a restart, > > > > > pgstat_read_statfile() will happily write beyond the end of > > > > > replSlotStats. > > > > > > > > I think we cannot restart the server after lowering > > > > max_replication_slots to a value less than the number of replication > > > > slots actually created on the server. No? > > > > > > This problem happens in the case where max_replication_slots is > > > lowered and there still are stats for a slot. > > > > > > > I think this can happen only if the drop message is lost, right? > > Yes, I think you're right. In that case, the stats file could have > more slots statistics than the lowered max_replication_slots.
Or if slots are deleted on the file-system while the cluster is shutdown. Which obviously is at best a semi-supported thing, but it normally does work. > > > I understood the risk of running out of replSlotStats. If we use the > > > index in replSlotStats instead, IIUC we need to somehow synchronize > > > the indexes in between replSlotStats and > > > ReplicationSlotCtl->replication_slots. The order of replSlotStats is > > > preserved across restarting whereas the order of > > > ReplicationSlotCtl->replication_slots isn’t (readdir() that is used by > > > StartupReplicationSlots() doesn’t guarantee the order of the returneda > > > entries in the directory). Very good point. Even if readdir() order were fixed, we'd still have the problem because there can be "gaps" in the indexes for slots (e.g. create slot_a, create slot_b, create slot_c, drop slot_b, leaving you with index 0 and 2 used, and 1 unused). > > > Maybe we can compare the slot name in the > > > received message to the name in the element of replSlotStats. If they > > > don’t match, we swap entries in replSlotStats to synchronize the index > > > of the replication slot in ReplicationSlotCtl->replication_slots and > > > replSlotStats. If we cannot find the entry in replSlotStats that has > > > the name in the received message, it probably means either it's a new > > > slot or the previous create message is dropped, we can create the new > > > stats for the slot. Is that what you mean, Andres? That doesn't seem great. Slot names are imo a poor identifier for something happening asynchronously. The stats collector regularly doesn't process incoming messages for periods of time because it is busy writing out the stats file. That's also when messages to it are most likely to be dropped (likely because the incoming buffer is full). Perhaps we could have RestoreSlotFromDisk() send something to the stats collector ensuring the mapping makes sense? Greetings, Andres Freund