At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <mich...@paquier.xyz> wrote in
> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <amit.kapil...@gmail.com>
> > wrote in
> >> It is a little unclear to me how this or any proposed patch will solve
> >> the original problem reported by Fujii-San?  Basically, the problem
> >> arises because we don't have an interlock between when the checkpoint
> >> removes the WAL segment and the view tries to acquire the same.  Am, I
> >> missing something?
>
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include.  So it is an attempt to take the safest spot possible.

Minimum LSN (lastRemovedSegNo) is not protected by the lock. That
makes no difference.

> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good

I see it as silly.  I think I said upthread that it was the distance
to the point where the slot loses a segment, and it was rejected, but
just removing it makes us unable to estimate the distance, so it is
there.

> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slot provides as we have a min()
> aggregate for pg_lsn.

min(lastRemovedSegNo) is the earliest value. It is enough to read it
once at the start and then use it for all slots.

> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary.  We could for
> example just have a separate SQL function that just grabs this value
> (or a more global SQL view for XLogCtl data that includes this data).

I think we need at least one of the "distance" above or min_safe_lsn
somewhere reachable by users.

> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
>
> We should really not do that anyway for a monitoring view.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
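
The min() emulation mentioned in the quoted text can be sketched on the
client side. This is only a minimal Python sketch, assuming the
restart_lsn values have already been fetched as text from
pg_replication_slots; the function names are hypothetical, and it only
mirrors the idea of taking the oldest LSN across slots, not the server
code.

```python
from typing import Iterable, Optional


def parse_pg_lsn(text: str) -> int:
    """Convert a pg_lsn string such as '0/3000060' to a 64-bit position.

    pg_lsn is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits, then the low 32 bits.
    """
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_pg_lsn(value: int) -> str:
    """Convert a 64-bit position back to the 'hi/lo' hexadecimal form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def min_restart_lsn(restart_lsns: Iterable[Optional[str]]) -> Optional[str]:
    """Return the oldest restart_lsn across slots, ignoring NULLs (None).

    Hypothetical helper: it assumes restart_lsn is the per-slot value
    being minimized, as a stand-in for what the server computes.
    """
    valid = [parse_pg_lsn(lsn) for lsn in restart_lsns if lsn is not None]
    return format_pg_lsn(min(valid)) if valid else None


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(min_restart_lsn(["0/3000060", None, "0/2000028", "1/0"]))
```

Comparing the parsed integers rather than the strings matters: a plain
text comparison would order "10/0" before "9/0".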