On Mon, May 10, 2021 at 9:21 PM Robert Haas <robertmh...@gmail.com> wrote: > > On Sun, May 9, 2021 at 1:26 AM Amul Sul <sula...@gmail.com> wrote: > > The state in the control file also gets cleared. Though, after > > clearing in memory the state patch doesn't really do the immediate > > change to the control file, it relies on the next UpdateControlFile() > > to do that. > > But when will that happen? If you're relying on some very nearby code, > that might be OK, but perhaps a comment is in order. If you're just > thinking it's going to happen eventually, I think that's not good > enough. >
Ok. > > Regarding log message I think I have skipped that intentionally, to > > avoid confusing log as "system is now read write" when we do start as > > hot-standby which is not really read-write. > > I think the message should not be phrased that way. In fact, I think > now that we've moved to calling this pg_prohibit_wal() rather than > ALTER SYSTEM READ ONLY, a lot of messages need to be rethought, and > maybe some comments and function names as well. Perhaps something > like: > > system is read only -> WAL is now prohibited > system is read write -> WAL is no longer prohibited > > And then for this particular case, maybe something like: > > clearing WAL prohibition because the system is in archive recovery > Ok, thanks for the suggestions. > > > The second part of this proposal was: > > > > > > "2. Create a new function with a name like XLogAcceptWrites(). Move the > > > following things from StartupXLOG() into that function: (1) the call > > > to UpdateFullPageWrites(), (2) the following block of code that does > > > either CreateEndOfRecoveryRecord() or RequestCheckpoint() or > > > CreateCheckPoint(), (3) the next block of code that runs > > > recovery_end_command, (4) the call to XLogReportParameters(), and (5) > > > the call to CompleteCommitTsInitialization(). Call the new function > > > from the place where we now call XLogReportParameters(). This would > > > mean that (1)-(3) happen later than they do now, which might require > > > some adjustments." > > > > > > Now you moved that code, but you also moved (6) > > > CompleteCommitTsInitialization(), (7) setting the control file to > > > DB_IN_PRODUCTION, (8) setting the state to RECOVERY_STATE_DONE, and > > > (9) requesting a checkpoint if we were just promoted. That's not what > > > was proposed. One result of this is that the server now thinks it's in > > > recovery even after the startup process has exited. > > > RecoveryInProgress() is still returning true everywhere. But that is > > > inconsistent with what Andres and I were recommending in > > > http://postgr.es/m/CA+TgmoZYQN=rcye-ixwnjdvmaoh+7jaqsif3u2k8xqxipba...@mail.gmail.com > > > > Regarding modified approach, I tried to explain that why I did > > this in > > http://postgr.es/m/CAAJ_b96Yb4jaW6oU1bVYEBaf=tq-ql+mmt1exfwvnzer7xr...@mail.gmail.com > > I am not able to understand what problem you are seeing there. If > we're in crash recovery, then nobody can connect to the database, so > there can't be any concurrent activity. If we're in archive recovery, > we now clear the WAL-is-prohibited flag so that we will go read-write > directly at the end of recovery. We can and should refuse any effort > to call pg_prohibit_wal() during recovery. If we reached the end of > crash recovery and are now permitting read-only connections, why would > anyone be able to write WAL before the system has been changed to > read-write? If that can happen, it's a bug, not a reason to change the > design. > > Maybe your concern here is about ordering: the process that is going > to run XLogAcceptWrites() needs to allow xlog writes locally before we > tell other backends that they also can xlog writes; otherwise, some > other records could slip in before UpdateFullPageWrites() and similar > have run, which we probably don't want. But that's why > LocalSetXLogInsertAllowed() was invented, and if it doesn't quite do > what we need in this situation, we should be able to tweak it so it > does. > Yes, we don't want any write slip in before UpdateFullPageWrites(). Recently[1], we have decided to let the Checkpointed process call XLogAcceptWrites() unconditionally. Here problem is that when a backend executes the pg_prohibit_wal(false) function to make the system read-write, the wal prohibited state is set to inprogress(ie. WALPROHIBIT_STATE_GOING_READ_WRITE) and then Checkpointer is signaled. Next, Checkpointer will convey this system change to all existing backends using a global barrier, and after that final wal prohibited state is set to the read-write(i.e. WALPROHIBIT_STATE_READ_WRITE). While Checkpointer is in the progress of conveying this global barrier, any new backend can connect at that time and can write a new record because the inprogress read-write state is equivalent to the final read-write state iff LocalXLogInsertAllowed != 0 for that backend. And, that new record could slip in before or in between records to be written by XLogAcceptWrites(). 1] http://postgr.es/m/CA+TgmoZYQN=rcye-ixwnjdvmaoh+7jaqsif3u2k8xqxipba...@mail.gmail.com Regards, Amul