On Mon, Apr 27, 2026 at 11:00 AM Alexander Lakhin <[email protected]> wrote: > > Hello Sawada-san, > > 24.04.2026 20:52, Masahiko Sawada wrote: > > Right. The postmaster blocks all signals before starting child process > as the following comment explains: > > /* > * We start postmaster children with signals blocked. This allows them > to > * install their own handlers before unblocking, to avoid races where > they > * might run the postmaster's handler and miss an important control > * signal. With more analysis this could potentially be relaxed. > */ > sigprocmask(SIG_SETMASK, &BlockSig, &save_mask); > > Investigating the issue, I found there is a race condition between the > procsignal initialization and emitting signal barrier that could be > the cause of this issue. Imagine the following scenario: > > 1. In ProcSignalInit(), the checkpointer initializes its > slot->pss_barrierGeneration with the global generation. > 2. In EmitProcSignalBarrier(), the startup checks the checkpointer's > procsignal slot but it skips emitting the signal as slot->pss_pid is > still 0. It can happen even though the checkpointer holds a spinlock > on its slot during the initialization because the first pid check is > done without a spinlock acquisition. > 3. The checkpointer sets its pid to slot->pss_pid and releases the spin lock. > 4. In WaitForProcSignalBarrier(), the startup checks the > checkpointer's procsignal slot that has already initialized the > pss_barrierGeneration, and waits for it to be updated. However, the > checkpointer never updates its barrier generation as it doesn't get > the signal. > > > Thank you for the investigation and explanation of the issue! > > I've been puzzled by a buildfarm failure [1] with such symptoms for a while > and even reproduced it locally once, but couldn't gather more information > that time. But now that you have described the scenario, I can easily > reproduce the same test failure with: > --- a/src/backend/storage/ipc/procsignal.c > +++ b/src/backend/storage/ipc/procsignal.c > @@ -206,6 +206,7 @@ ProcSignalInit(const uint8 *cancel_key, int > cancel_key_len) > if (cancel_key_len > 0) > memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len); > slot->pss_cancel_key_len = cancel_key_len; > +pg_usleep(10000); > pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
Thank you for testing this. I've attached a patch to address the issue. I haven't verified it across all versions yet, but I suspect it exists in the stable branches as well. Previously, the issue rarely occurred because EmitProcSignalBarrier() was only used for smgr invalidation. However, now that we use signal barriers for online wal_level changes and checksum status updates, this race condition is likely to be encountered more frequently. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
From 8ed72e1bc748f99fbf8b103ae5bd4cf395cb54ef Mon Sep 17 00:00:00 2001 From: Masahiko Sawada <[email protected]> Date: Tue, 28 Apr 2026 12:21:21 -0700 Subject: [PATCH v1] Fix race between ProcSignalInit() and EmitProcSignalBarrier(). ProcSignalInit() read the global barrier generation before publishing its PID into pss_pid. A concurrent EmitProcSignalBarrier() iterates the ProcSignal slots and skips any whose pss_pid is still zero, on the assumption that such a slot will pick up the new generation when it later reads psh_barrierGeneration. But because the joining backend had already read the (older) global generation under its slot's spinlock, it would store a stale value into pss_barrierGeneration and never absorb the just-emitted barrier, resulting that WaitForProcSignalBarrier() didn't complete. Publish pss_pid before reading psh_barrierGeneration, with a memory barrier in between so that the store is globally visible first. A concurrent EmitProcSignalBarrier() then either observes the published PID and signals this slot, or completes its generation increment before we load it. Discussion: https://postgr.es/m/caeze2wgajmwredn7chtba8er2ybvkcoa0kvn25-1evntrhs...@mail.gmail.com Backpatch-through: --- src/backend/storage/ipc/procsignal.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c index 264e4c22ca6..b0681ca0ae2 100644 --- a/src/backend/storage/ipc/procsignal.c +++ b/src/backend/storage/ipc/procsignal.c @@ -188,6 +188,16 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len) /* Clear out any leftover signal reasons */ MemSet(slot->pss_signalFlags, 0, NUM_PROCSIGNALS * sizeof(sig_atomic_t)); + /* + * Publish the PID before reading the global barrier generation to ensure + * that EmitProcSignalBarrier() doesn't skip us while we are grabbing an + * older generation. We need a memory barrier here to make sure that the + * update of pss_pid is globally visible before the load of the global + * barrier generation executes. + */ + pg_atomic_write_u32(&slot->pss_pid, MyProcPid); + pg_memory_barrier(); + /* * Initialize barrier state. Since we're a brand-new process, there * shouldn't be any leftover backend-private state that needs to be @@ -207,7 +217,6 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len) if (cancel_key_len > 0) memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len); slot->pss_cancel_key_len = cancel_key_len; - pg_atomic_write_u32(&slot->pss_pid, MyProcPid); SpinLockRelease(&slot->pss_mutex); -- 2.54.0
