On 2020-Jun-17, Fujii Masao wrote: > On 2020/06/17 3:50, Alvaro Herrera wrote:
> So InvalidateObsoleteReplicationSlots() can terminate normal backends. > But do we want to do this? If we want, we should add the note about this > case into the docs? Otherwise the users would be surprised at termination > of backends by max_slot_wal_keep_size. I guess that it's basically rarely > happen, though. Well, if we could distinguish a walsender from a non-walsender process, then maybe it would make sense to leave backends alive. But do we want that? I admit I don't know what would be the reason to have a non-walsender process with an active slot, so I don't have a good opinion on what to do in this case. > > > + /* > > > + * Signal to terminate the process using the replication slot. > > > + * > > > + * Try to signal every 100ms until it succeeds. > > > + */ > > > + if (!killed && kill(active_pid, SIGTERM) == 0) > > > + killed = true; > > > + ConditionVariableTimedSleep(&slot->active_cv, 100, > > > + > > > WAIT_EVENT_REPLICATION_SLOT_DROP); > > > + } while (ReplicationSlotIsActive(slot, NULL)); > > > > Note that here you're signalling only once and then sleeping many times > > in increments of 100ms -- you're not signalling every 100ms as the > > comment claims -- unless the signal fails, but you don't really expect > > that. On the contrary, I'd claim that the logic is reversed: if the > > signal fails, *then* you should stop signalling. > > You mean; in this code path, signaling fails only when the target process > disappears just before signaling. So if it fails, slot->active_pid is > expected to become 0 even without signaling more. Right? I guess kill() can also fail if the PID now belongs to a process owned by a different user. I think we've disregarded very quick reuse of PIDs, so we needn't concern ourselves with it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services