On Sat, Sep 12, 2020 at 1:23 AM Amul Sul <sula...@gmail.com> wrote:
> > So, if we're in the middle of a paced checkpoint with a large
> > checkpoint_timeout - a sensible real world configuration - we'll not
> > process ASRO until that checkpoint is over? That seems very much not
> > practical. What am I missing?
>
> Yes, the process doing ASRO will wait until that checkpoint is over.

That's not good. A typical busy system is going to be in the middle of a
checkpoint most of the time, and the checkpoint will take a long time to
finish - maybe minutes. We want this feature to respond within
milliseconds or a few seconds, not minutes. So we need something better
here.

I'm inclined to think that we should try to call
CompleteWALProhibitChange() at the same places we call
AbsorbSyncRequests(). We know from experience that bad things happen if
we fail to absorb sync requests in a timely fashion, so we probably have
enough calls to AbsorbSyncRequests() to make sure that work always gets
done promptly. So, if we do this work in the same places, it will also
get done promptly.

I'm not 100% sure whether that introduces any other problems. Certainly,
we're not going to be able to finish the checkpoint once we've gone
read-only, so we'll fail when we try to write the WAL record for that,
or maybe earlier if there's anything else that tries to write WAL.
Either the checkpoint needs to error out, like any other attempt to
write WAL, and we can attempt a new checkpoint if and when we go
read/write, or else we need to finish writing stuff out to disk but not
actually write the checkpoint completion record (or any other WAL)
unless and until the system goes back into read/write mode - and then at
that point the previously-started checkpoint will finish normally. The
latter seems better if we can make it work, but the former is probably
also acceptable. What you've got right now is not.

--
Robert Haas
EDB: http://www.enterprisedb.com
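
To make the suggestion above concrete, here is a toy model of a paced checkpoint that services the read-only request at the same point where it absorbs sync requests, and that defers the completion record while WAL is prohibited (the second of the two options). It is only a sketch: every name in it (CheckpointerSim, absorb_requests, run_checkpoint, resume_read_write) is hypothetical, nothing here is PostgreSQL source, and it assumes that flushing data pages writes no WAL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for checkpointer state; not PostgreSQL code. */
typedef struct CheckpointerSim
{
    bool wal_prohibit_requested;    /* a backend asked to go read-only */
    bool wal_prohibited;            /* the system is now read-only */
    int  pending_sync_requests;     /* queued fsync requests */
    int  buffers_written;           /* progress of the paced checkpoint */
    bool completion_record_pending; /* data flushed, WAL record deferred */
    bool checkpoint_complete;
} CheckpointerSim;

/*
 * The single servicing point: wherever sync requests are absorbed, the
 * pending read-only transition is completed too.
 */
static void
absorb_requests(CheckpointerSim *s)
{
    s->pending_sync_requests = 0;

    if (s->wal_prohibit_requested)
    {
        s->wal_prohibited = true;
        s->wal_prohibit_requested = false;
    }
}

/*
 * Paced checkpoint over nbuffers buffers. A read-only request arrives
 * just before buffer number request_at is written (-1 for no request).
 * Returns how many buffers had been written when the transition
 * completed, or -1 if it never happened.
 */
static int
run_checkpoint(CheckpointerSim *s, int nbuffers, int request_at)
{
    int completed_at = -1;

    for (int i = 0; i < nbuffers; i++)
    {
        if (i == request_at)
            s->wal_prohibit_requested = true;   /* simulated request */

        absorb_requests(s);     /* serviced between buffer writes */

        if (s->wal_prohibited && completed_at < 0)
            completed_at = s->buffers_written;

        s->buffers_written++;   /* assumed to write no WAL in this model */
    }

    /* Do not write the completion record while WAL is prohibited. */
    if (s->wal_prohibited)
        s->completion_record_pending = true;
    else
        s->checkpoint_complete = true;

    return completed_at;
}

/* Going back to read/write lets the deferred checkpoint finish. */
static void
resume_read_write(CheckpointerSim *s)
{
    assert(s->wal_prohibited);
    s->wal_prohibited = false;

    if (s->completion_record_pending)
    {
        s->completion_record_pending = false;
        s->checkpoint_complete = true;
    }
}
```

In this model the transition completes within one buffer write of the request rather than at the end of the checkpoint. The first option, erroring out the checkpoint and starting a new one after going read/write, would replace the deferral branch with a failure path.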