On Mon, Apr 27, 2026 at 11:00 AM Alexander Lakhin <[email protected]> wrote:
>
> Hello Sawada-san,
>
> 24.04.2026 20:52, Masahiko Sawada wrote:
>
> Right. The postmaster blocks all signals before starting child process
> as the following comment explains:
>
>      /*
>       * We start postmaster children with signals blocked.  This allows them 
> to
>       * install their own handlers before unblocking, to avoid races where 
> they
>       * might run the postmaster's handler and miss an important control
>       * signal. With more analysis this could potentially be relaxed.
>       */
>      sigprocmask(SIG_SETMASK, &BlockSig, &save_mask);
>
> Investigating the issue, I found there is a race condition between the
> procsignal initialization and emitting signal barrier that could be
> the cause of this issue. Imagine the following scenario:
>
> 1. In ProcSignalInit(), the checkpointer initializes its
> slot->pss_barrierGeneration with the global generation.
> 2. In EmitProcSignalBarrier(), the startup checks the checkpointer's
> procsignal slot but it skips emitting the signal as slot->pss_pid is
> still 0. It can happen even though the checkpointer holds a spinlock
> on its slot during the initialization because the first pid check is
> done without a spinlock acquisition.
> 3. The checkpointer sets its pid to slot->pss_pid and releases the spin lock.
> 4. In WaitForProcSignalBarrier(), the startup checks the
> checkpointer's procsignal slot that has already initialized the
> pss_barrierGeneration, and waits for it to be updated. However, the
> checkpointer never updates its barrier generation as it doesn't get
> the signal.
>
>
> Thank you for the investigation and explanation of the issue!
>
> I've been puzzled by a buildfarm failure [1] with such symptoms for a while
> and even reproduced it locally once, but couldn't gather more information
> that time. But now that you have described the scenario, I can easily
> reproduce the same test failure with:
> --- a/src/backend/storage/ipc/procsignal.c
> +++ b/src/backend/storage/ipc/procsignal.c
> @@ -206,6 +206,7 @@ ProcSignalInit(const uint8 *cancel_key, int 
> cancel_key_len)
>         if (cancel_key_len > 0)
>                 memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len);
>         slot->pss_cancel_key_len = cancel_key_len;
> +pg_usleep(10000);
>         pg_atomic_write_u32(&slot->pss_pid, MyProcPid);

Thank you for testing this.

I've attached a patch to address the issue. I haven't verified it
across all versions yet, but I suspect it exists in the stable
branches as well. Previously, the issue rarely occurred because
EmitProcSignalBarrier() was only used for smgr invalidation. However,
now that we use signal barriers for online wal_level changes and
checksum status updates, this race condition is likely to be
encountered more frequently.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From 8ed72e1bc748f99fbf8b103ae5bd4cf395cb54ef Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <[email protected]>
Date: Tue, 28 Apr 2026 12:21:21 -0700
Subject: [PATCH v1] Fix race between ProcSignalInit() and
 EmitProcSignalBarrier().

ProcSignalInit() read the global barrier generation before publishing
its PID into pss_pid. A concurrent EmitProcSignalBarrier() iterates
the ProcSignal slots and skips any whose pss_pid is still zero, on the
assumption that such a slot will pick up the new generation when it
later reads psh_barrierGeneration. But because the joining backend had
already read the (older) global generation under its slot's spinlock,
it would store a stale value into pss_barrierGeneration and never
absorb the just-emitted barrier, resulting that
WaitForProcSignalBarrier() didn't complete.

Publish pss_pid before reading psh_barrierGeneration, with a memory
barrier in between so that the store is globally visible first.	A
concurrent EmitProcSignalBarrier() then either observes the published
PID and signals this slot, or completes its generation increment
before we load it.

Discussion: https://postgr.es/m/caeze2wgajmwredn7chtba8er2ybvkcoa0kvn25-1evntrhs...@mail.gmail.com
Backpatch-through:
---
 src/backend/storage/ipc/procsignal.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 264e4c22ca6..b0681ca0ae2 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -188,6 +188,16 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
 	/* Clear out any leftover signal reasons */
 	MemSet(slot->pss_signalFlags, 0, NUM_PROCSIGNALS * sizeof(sig_atomic_t));
 
+	/*
+	 * Publish the PID before reading the global barrier generation to ensure
+	 * that EmitProcSignalBarrier() doesn't skip us while we are grabbing an
+	 * older generation. We need a memory barrier here to make sure that the
+	 * update of pss_pid is globally visible before the load of the global
+	 * barrier generation executes.
+	 */
+	pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
+	pg_memory_barrier();
+
 	/*
 	 * Initialize barrier state. Since we're a brand-new process, there
 	 * shouldn't be any leftover backend-private state that needs to be
@@ -207,7 +217,6 @@ ProcSignalInit(const uint8 *cancel_key, int cancel_key_len)
 	if (cancel_key_len > 0)
 		memcpy(slot->pss_cancel_key, cancel_key, cancel_key_len);
 	slot->pss_cancel_key_len = cancel_key_len;
-	pg_atomic_write_u32(&slot->pss_pid, MyProcPid);
 
 	SpinLockRelease(&slot->pss_mutex);
 
-- 
2.54.0

Reply via email to