From 217224d06efc8e2185c6eec89425c9efb2634cc1 Mon Sep 17 00:00:00 2001
From: Amit Langote <amitlan@postgresql.org>
Date: Thu, 18 Dec 2025 17:15:09 +0900
Subject: [PATCH v4] Fix segfault from releasing locks in detached DSM segments

If a FATAL error occurs while holding a lock in a DSM segment (such
as a dshash lock) and the process is not in a transaction, a
segmentation fault can occur during process exit.

The problem sequence is:

 1. Process acquires a lock in a DSM segment (e.g., via dshash)
 2. FATAL error occurs outside transaction context
 3. proc_exit() begins, calling before_shmem_exit callbacks
 4. dsm_backend_shutdown() detaches all DSM segments
 5. Later, on_shmem_exit callbacks run
 6. ProcKill() calls LWLockReleaseAll()
 7. Segfault: the lock being released is in unmapped memory

This only manifests outside transaction contexts because
AbortTransaction() calls LWLockReleaseAll() during transaction
abort, releasing locks before DSM cleanup. Background workers and
other non-transactional code paths are vulnerable.

Fix by calling LWLockReleaseAll() unconditionally at the start of
shmem_exit(), before any callbacks run. Releasing locks before
callbacks prevents the segfault - locks must be released before
dsm_backend_shutdown() detaches their memory. This is safe because
after an error, held locks are protecting potentially inconsistent
data anyway, and callbacks can acquire fresh locks if needed.

Also add a comment noting that LWLockReleaseAll() must be safe to
call before LWLock initialization (which it is, since
num_held_lwlocks will be 0).

This fix aligns with the original design intent from commit
001a573a2, which noted that backends must clean up shared memory
state (including releasing lwlocks) before unmapping dynamic shared
memory segments.

Reported-by: Rahila Syed <rahilasyed90@gmail.com>
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2L28uSvyiosL+kaic9249jRVoQiQF6JOnaCitKFq=xiFzX3g@mail.gmail.com
Backpatch-through: 14
---
 src/backend/storage/ipc/ipc.c     | 11 +++++++++--
 src/backend/storage/lmgr/lwlock.c |  4 ++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 2704e80b3a7..14cde157701 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -29,6 +29,7 @@
 #endif
 #include "storage/dsm.h"
 #include "storage/ipc.h"
+#include "storage/lwlock.h"
 #include "tcop/tcopprot.h"
 
 
@@ -229,13 +230,19 @@ shmem_exit(int code)
 {
 	shmem_exit_inprogress = true;
 
+	/*
+	 * Release any LWLocks we might be holding before callbacks run. This
+	 * prevents accessing locks in detached DSM segments and allows callbacks
+	 * to acquire new locks.
+	 */
+	LWLockReleaseAll();
+
 	/*
 	 * Call before_shmem_exit callbacks.
 	 *
 	 * These should be things that need most of the system to still be up and
 	 * working, such as cleanup of temp relations, which requires catalog
-	 * access; or things that need to be completed because later cleanup steps
-	 * depend on them, such as releasing lwlocks.
+	 * access.
 	 */
 	elog(DEBUG3, "shmem_exit(%d): %d before_shmem_exit callbacks to make",
 		 code, before_shmem_exit_index);
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 255cfa8fa95..4cbbbce2cbc 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1940,6 +1940,10 @@ LWLockReleaseClearVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
  * unchanged by this operation.  This is necessary since InterruptHoldoffCount
  * has been set to an appropriate level earlier in error recovery. We could
  * decrement it below zero if we allow it to drop for each released lock!
+ *
+ * Note that this function must be safe to call even before the LWLock
+ * subsystem has been initialized (e.g., during early startup failures).
+ * In that case, num_held_lwlocks will be 0 and we do nothing.
  */
 void
 LWLockReleaseAll(void)
-- 
2.47.3

