Re: Deadlock detector fails to activate on a hot standby replica

Xuneng Zhou Mon, 29 Jun 2026 20:15:24 -0700

On Sat, Jun 27, 2026 at 2:11 PM Xuneng Zhou <[email protected]> wrote:
>
> On Fri, Jun 26, 2026 at 5:31 PM Vitaly Davydov <[email protected]> 
> wrote:
> >
> > Hi Fujii-san, Xuneng Zhou,
> >
> >  > 1) It looks that this patch is doing some refactoring around
> >  > LockBufferForCleanup. The part being touched seems flaky itself on
> >  > HEAD, should we fix the issue in the buffer side first? [1]
> >
> > Thank you for mentioning it. It seems, there is a conflict with
> > the another patch due to the introduction of RegisterPinCountWaiter
> > function, that can not be resolved automatically. I'm ok to wait for
> > the conflicting fix to be released. It seems, our patch should be
> > reconsidered after the conflicting patch release.
>
> OK.
>
> >  > 2) buf_state &= ~BM_PIN_COUNT_WAITER in stable versions seems not
> >  > necessary to me as LockBufferForCleanup will handle the clearing for
> >  > us.
> >
> > It seems it was my mistake when I resolved conflicts when cherry-picking
> > the changes from other branches. I removed this line in my recent v 6.1
> > attached patches. Thank you for catching it.
>
> Thanks.
>
> Here are some additional comments after further review:
>
> 3) should we let RegisterPinCountWaiter do less?
>
> Currently, the function does a lot like:
> RegisterPinCountWaiter()
> {
>     check other waiter;
>     maybe unlock;
>     maybe elog(ERROR);
>     register;
>     unlock header;
> }
>
> We need to handle the unlock & elog for two
> callers(BufferIsReadyForCleanup and LockBufferForCleanup) with
> different needs, which could be error-prone sometimes.
>
> + Assert((buf_state & BM_PIN_COUNT_WAITER) == 0 ||
> +    bufHdr->wait_backend_pgprocno == MyProcNumber);
> +
> + if ((buf_state & BM_PIN_COUNT_WAITER) != 0 &&
> + bufHdr->wait_backend_pgprocno != MyProcNumber)
> + {
> + UnlockBufHdr(bufHdr);
> + elog(ERROR, "multiple processes attempting to wait for pincount 1");
> + }
> +
>
> For example, unlocking the header lock only is sufficient for
> BufferIsReadyForCleanup but not for LockBufferForCleanup as we hold
> the extra content lock. For LockBufferForCleanup, the check seems
> duplicated as we already did it before invoking this function. So to
> make this work, we may need to distinguish these two callers.
>
> Another potential issue of doing safe checks here is the order of
> assert and error out:
> In assert build, if the condition is true, then the header lock and
> content lock would not be released properly. So we may need to adjust
> the order.
>
> This leads me to wonder -- would it be better to let this help
> function to do less, like just for registration and let the callers to
> handle others.
>
> 4) the already-expired standby-delay path may not wake new pin holders
>
> If ltime has already passed, the patch sends
> RECOVERY_CONFLICT_BUFFERPIN once before entering the loop, then waits
> inside the new loop. What if another backend pins the buffer after
> that broadcast?
>
> Before the patch, this seems fine because we return to
> BufferIsReadyForCleanup and enter the function
> ResolveRecoveryConflictWithBufferPin to do the re-check of ltime and
> send the conflict signal once more.
>
> With the new inner loop, BufferIsReadyForCleanup() can re-register and
> continue waiting, but no timeout is active and no new cancelling
> request is sent. So recovery can continue waiting for a buffer pin
> even though max_standby_*_delay has already expired.
>
> To fix this, should we add an in-loop check: if ltime != 0 &&
> GetCurrentTimestamp() >= ltime in here:
>
> + if (got_standby_delay_timeout)
> + {
> + SendRecoveryConflictWithBufferPin(RECOVERY_CONFLICT_BUFFERPIN);
> + break;
> + }
>
> 5) The name of BufferIsReadyForCleanup seems broader than the actual protocol.
>
> The function name seems not align with the intended use of it as the
> comment suggested:
>
>  *
>  * This is only for the hot-standby path in LockBufferForCleanup(), via
>  * ResolveRecoveryConflictWithBufferPin(), after ProcWaitForSignal() returns.
>  * The caller must already be registered as the shared buffer's
>  * BM_PIN_COUNT_WAITER.
>
> Would it be better to rename it to something like
> RecheckCleanupLockPinWait or RecheckBufferPinWaiter?


I took another look at the patch. Here is an another thought:

6) We might want to be more elaborative on the commit message

>From personal experience, the message seems a bit hard to follow if
one does not read the related discussion. I like your analysis of the
problem at the start of this thread [1]. Would it be sensible to
summarize and reuse some part of it for the message?

The motivation of this patch is to fix deadlock detector activation,
but the wait-loop change has a slightly broader benefit. It also
improves the standby timeout path indirectly by avoiding needless
timeout teardown and re-entry on unrelated wakeups, and by keeping any
already-fired timeout state until the loop handles it. Should we
mention this as an additional benefit, or even frame the message
broader like:

Fix the buffer-pin recovery-conflict wait loop.

[1] 
https://www.postgresql.org/message-id/44c24dcf-5710-410f-b1b6-d10b315f3d51%40postgrespro.ru



--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.

Re: Deadlock detector fails to activate on a hot standby replica

Reply via email to