On Fri, Jul 24, 2020 at 6:28 AM Soumyadeep Chakraborty < soumyadeep2...@gmail.com> wrote: > > On Thu, Jun 18, 2020 at 7:54 AM Robert Haas <robertmh...@gmail.com> wrote: > > I think we'd want the FIRST write operation to be the end-of-recovery > > checkpoint, before the system is fully read-write. And then after that > > completes you could do other things. > > I can't see why this is necessary from a correctness or performance > point of view. Maybe I'm missing something. > > In case it is necessary, the patch set does not wait for the checkpoint to > complete before marking the system as read-write. Refer: > > /* Set final state by clearing in-progress flag bit */ > if (SetWALProhibitState(wal_state & ~(WALPROHIBIT_TRANSITION_IN_PROGRESS))) > { > if ((wal_state & WALPROHIBIT_STATE_READ_ONLY) != 0) > ereport(LOG, (errmsg("system is now read only"))); > else > { > /* Request checkpoint */ > RequestCheckpoint(CHECKPOINT_IMMEDIATE); > ereport(LOG, (errmsg("system is now read write"))); > } > } > > We should RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT) before > we SetWALProhibitState() and do the ereport(), if we have a read-write > state change request. > +1, I too have the same question.
FWIW, I don't we can request CHECKPOINT_WAIT for this place, otherwise, it think it will be deadlock case -- checkpointer process waiting for itself. > Also, we currently request this checkpoint even if there was no startup > recovery and we don't set CHECKPOINT_END_OF_RECOVERY in the case where > the read-write request does follow a startup recovery. > So it should really be: > RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT | > CHECKPOINT_END_OF_RECOVERY); > We would need to convey that an end-of-recovery-checkpoint is pending in > shmem somehow (and only if one such checkpoint is pending, should we do > it as a part of the read-write request handling). > Maybe we can set CHECKPOINT_END_OF_RECOVERY in ckpt_flags where we do: > /* > * Skip end-of-recovery checkpoint if the system is in WAL prohibited state. > */ > and then check for that. > Yep, we need some indication that end-of-recovery was skipped at the startup, but I haven't added that since I wasn't sure do we really need CHECKPOINT_END_OF_RECOVERY as part of the previous concern? > Some minor comments about the code (some of them probably doesn't > warrant immediate attention, but for the record...): > > 1. There are some places where we can use a local variable to store the > result of RelationNeedsWAL() to avoid repeated calls to it. E.g. > brin_doupdate() > Ok. > 2. Similarly, we can also capture the calls to GetWALProhibitState() in > a local variable where applicable. E.g. inside WALProhibitRequest(). > I don't think so. > 3. Some of the functions that were added such as GetWALProhibitState(), > IsWALProhibited() etc could be declared static inline. > IsWALProhibited() can be static but not GetWALProhibitState() since it needed to be accessible from other files. > 4. IsWALProhibited(): Shouldn't it really be: > bool > IsWALProhibited(void) > { > uint32 walProhibitState = GetWALProhibitState(); > return (walProhibitState & WALPROHIBIT_STATE_READ_ONLY) != 0 > && (walProhibitState & WALPROHIBIT_TRANSITION_IN_PROGRESS) == 0; > } > I think the current one is better, this allows read-write transactions from existing backend which has absorbed barrier or from new backend while we changing stated to read-write in the assumption that we never fallback. > 5. I think the comments: > /* Must be performing an INSERT or UPDATE, so we'll have an XID */ > and > /* Can reach here from VACUUM, so need not have an XID */ > can be internalized in the function/macro comment header. > Ok. > 6. Typo: ConditionVariable readonly_cv; /* signaled when ckpt_started > advances */ > We need to update the comment here. > Ok. Will try to address all the above review comments in the next version along with Andres' concern/suggestion. Thanks again for your time. Regards, Amul