On Thu, Jun 18, 2020 at 7:54 AM Robert Haas <robertmh...@gmail.com> wrote: > I think we'd want the FIRST write operation to be the end-of-recovery > checkpoint, before the system is fully read-write. And then after that > completes you could do other things.
I can't see why this is necessary from a correctness or performance point of view. Maybe I'm missing something. In case it is necessary, the patch set does not wait for the checkpoint to complete before marking the system as read-write. Refer: /* Set final state by clearing in-progress flag bit */ if (SetWALProhibitState(wal_state & ~(WALPROHIBIT_TRANSITION_IN_PROGRESS))) { if ((wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0) ereport(LOG, (errmsg("system is now read only"))); else { /* Request checkpoint */ RequestCheckpoint(CHECKPOINT_IMMEDIATE); ereport(LOG, (errmsg("system is now read write"))); } } We should RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT) before we SetWALProhibitState() and do the ereport(), if we have a read-write state change request. Also, we currently request this checkpoint even if there was no startup recovery and we don't set CHECKPOINT_END_OF_RECOVERY in the case where the read-write request does follow a startup recovery. So it should really be: RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT | CHECKPOINT_END_OF_RECOVERY); We would need to convey that an end-of-recovery-checkpoint is pending in shmem somehow (and only if one such checkpoint is pending, should we do it as a part of the read-write request handling). Maybe we can set CHECKPOINT_END_OF_RECOVERY in ckpt_flags where we do: /* * Skip end-of-recovery checkpoint if the system is in WAL prohibited state. */ and then check for that. Some minor comments about the code (some of them probably doesn't warrant immediate attention, but for the record...): 1. There are some places where we can use a local variable to store the result of RelationNeedsWAL() to avoid repeated calls to it. E.g. brin_doupdate() 2. Similarly, we can also capture the calls to GetWALProhibitState() in a local variable where applicable. E.g. inside WALProhibitRequest(). 3. Some of the functions that were added such as GetWALProhibitState(), IsWALProhibited() etc could be declared static inline. 4. IsWALProhibited(): Shouldn't it really be: bool IsWALProhibited(void) { uint32 walProhibitState = GetWALProhibitState(); return (walProhibitState & WALPROHIBIT_STATE_READ_ONLY) != 0 && (walProhibitState & WALPROHIBIT_TRANSITION_IN_PROGRESS) == 0; } 5. I think the comments: /* Must be performing an INSERT or UPDATE, so we'll have an XID */ and /* Can reach here from VACUUM, so need not have an XID */ can be internalized in the function/macro comment header. 6. Typo: ConditionVariable readonly_cv; /* signaled when ckpt_started advances */ We need to update the comment here. Regards, Soumyadeep (VMware)