On Thu, Jan 7, 2021 at 12:45 PM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Thu, Jan 7, 2021 at 11:32 AM Amul Sul <sula...@gmail.com> wrote:
> > Snip from CreateCheckPoint():
> > --
> > /*
> >  * Acquire CheckpointLock to ensure only one checkpoint happens at a time.
> >  * (This is just pro forma, since in the present system structure there is
> >  * only one process that is allowed to issue checkpoints at any given
> >  * time.)
> >  */
> > LWLockAcquire(CheckpointLock, LW_EXCLUSIVE);
> > --
> >
> > As per this comment, it seems to be that we don't really need this LW lock. 
> > We
> > could have something else instead if we are afraid of having multiple
> > checkpoints at any given time which isn't possible, btw.
>
> Looks like CheckpointLock() can also be called in standalone backends
> (RequestCheckpoint->CreateCheckPoint) along with the checkpointer
> process. Having said that, I'm not quite sure whether it can get
> called at the same time from a backend(may be with checkpoint;
> command) and the checkpointer process.
>
>     /*
>      * If in a standalone backend, just do it ourselves.
>      */
>     if (!IsPostmasterEnvironment)
>     {
>         /*
>          * There's no point in doing slow checkpoints in a standalone backend,
>          * because there's no other backends the checkpoint could disrupt.
>          */
>         CreateCheckPoint(flags | CHECKPOINT_IMMEDIATE);
>
> See the below comment for IsPostmasterEnvironment:
>
> /*
>  * IsPostmasterEnvironment is true in a postmaster process and any postmaster
>  * child process; it is false in a standalone process (bootstrap or
>  * standalone backend).
>

I am not sure I understood your point completely.  Do you mean there could be
multiple bootstrap or standalone backends that could exist at a time and can
perform checkpoint?

> > This CheckpointLock is acquired almost at the start of CreateCheckPoint() 
> > and
> > released at the end. The in-between code can take a while to be complete. 
> > All
> > that time interrupt will be on hold which doesn't seem to be a great idea 
> > but
> > that wasn't problematic until now.  I am having trouble due to this 
> > interrupt
> > hold for a long since I was trying to add code changes where the 
> > checkpointer
> > process is suppose to respond to the system read-write state change request 
> > as
> > soon as possible[1]. That cannot be done if the interrupt is disabled
> > since read-write state change uses the global barrier mechanism to convey 
> > system
> > state changes to other running processes.
> >
> > So my question is, can we completely get rid of the CheckpointLock need in
> > CreateCheckPoint()?
> >
> > Thoughts/Comments/Suggestions?
>
> I don't think we can completely get rid of CheckpointLock in
> CreateCheckPoint given the fact that it can get called from multiple
> processes.
>

How?

> How about serving that ASRO barrier request alone for checkpointer
> process in ProcessInterrupts even though the CheckpointLock is held by
> the checkpointer process? And also while the checkpointing is
> happening, may be we should call CHECK_FOR_INTERRUPTS to see if the
> ASRO barrier has arrived. This may sound bit hacky, but just a
> thought. I'm thinking that in ProcessInterrupts we can get to know the
> checkpointer process type via MyAuxProcType, and whether the lock is
> held or not using CheckpointLock, and then we can handle the ASRO
> global barrier request.
>

I am afraid that this will easily get rejected by the community.  Note that this
is a very crucial code path of the database server. There are other options
too, but don't want to propose them until clarity on the CheckpointLock
necessity.

Regards,
Amul


Reply via email to