Hi,

Partially replying here to an email on -committers [1].

On 2024-07-29 13:57:02 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > However, I still don't think it's a problem to assert that the lock is held
> > in the unlock "routine". As mentioned before, the spinlock implementation
> > itself has never followed the "just straight line code" rule that users of
> > spinlocks are supposed to follow.
>
> Yeah, that's fair.

Cool.


On 2024-07-29 13:56:05 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > b) Instead define the spinlock to have 1 as the unlocked state and 0 as the
> >    locked state. That makes it a bit harder to understand that initialization
> >    is missing, compared to a dedicated state, as the first use of the spinlock
> >    just blocks.
>
> This option works for me.

> > To make 1) b) easier to understand it might be worth changing the slock_t
> > typedef to be something like
>
> > typedef struct slock_t
> > {
> >         char is_free;
> > } slock_t;
>
> +1

Cool. I've attached a prototype.


I just realized there's a nice little advantage to the "inverted"
representation - it detects missing initialization even in optimized builds,
where assertions are compiled out, because the first acquisition of a
zero-initialized lock simply blocks.
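
To spell that out with a toy example (hypothetical, not part of the patch,
x86-64/gcc only; the names are made up for illustration): a lock that was
merely zero-initialized already fails the very first TAS attempt, no assertion
required. In a real backend the acquisition would just keep spinning until the
stuck-spinlock check PANICs.

#include <stdio.h>

typedef struct slock_t
{
	char	is_free;
} slock_t;

/* same inverted-representation xchg as in the patch */
static int
tas(volatile slock_t *lock)
{
	char	was_free = 0;

	__asm__ __volatile__(
		"	lock			\n"
		"	xchgb	%0,%1	\n"
		: "+q"(was_free), "+m"(lock->is_free)
		: /* no inputs */
		: "memory", "cc");
	return was_free == 0;		/* nonzero means "already locked" */
}

int
main(void)
{
	static slock_t forgotten;	/* zero-initialized, SpinLockInit() never called */

	/*
	 * With the old 0-is-unlocked convention this would silently succeed; with
	 * the inverted representation the very first attempt fails, assertions or
	 * not.
	 */
	printf("first TAS %s\n",
		   tas(&forgotten) ? "failed - missing init detected" : "succeeded");
	return 0;
}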


> How much of this would we change across platforms, and how much
> would be x86-only?  I think there are enough people developing on
> ARM (e.g. Mac) now to make it worth covering that, but maybe we
> don't care so much about anything else.

Not sure. Right now I've only hacked up x86-64 (not even touching i386), but
it shouldn't be hard to change at least some additional platforms.

My current prototype requires S_UNLOCK, S_LOCK_FREE, S_INIT_LOCK to be
implemented for x86-64 instead of using the "generic" implementation. That'd
be mildly annoying duplication if we did so for a few more platforms.


It'd be more palatable to just change all platforms if we made more of them
use __sync_lock_test_and_set (or some other intrinsic(s))...
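
E.g. something very roughly like this (an untested sketch of mine, not an
existing implementation; on targets where __sync_lock_test_and_set() only
guarantees storing the constant 1, we'd need __atomic_exchange_n() or another
intrinsic instead):

typedef struct slock_t
{
	char	is_free;
} slock_t;

#define HAS_TEST_AND_SET

static inline int
tas(volatile slock_t *lock)
{
	/* atomically store 0 (locked), acquire barrier, return previous state */
	char	was_free = __sync_lock_test_and_set(&lock->is_free, 0);

	return was_free == 0;	/* nonzero means the lock was already held */
}

#define TAS(lock)			tas(lock)
#define TAS_SPIN(lock)		((lock)->is_free == 0 ? 1 : TAS(lock))

#define S_UNLOCK(lock) \
	do { __sync_synchronize(); (lock)->is_free = 1; } while (0)
#define S_LOCK_FREE(lock)	((lock)->is_free == 1)
#define S_INIT_LOCK(lock)	S_UNLOCK(lock)

That'd keep the inverted representation, S_UNLOCK(), S_LOCK_FREE() etc. in one
place instead of duplicating them per architecture.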

Greetings,

Andres Freund

[1] https://postgr.es/m/2812376.1722275765%40sss.pgh.pa.us
From 06bd992f4be2f46586bfc7bab3135b3a2ca84e72 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 29 Jul 2024 09:48:26 -0700
Subject: [PATCH v2 1/2] Detect unlocking an unlocked spinlock

Author:
Reviewed-by:
Discussion: https://postgr.es/m/20240729164026.yurg37tej34o7...@awork3.anarazel.de
Backpatch:
---
 src/include/storage/spin.h | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/include/storage/spin.h b/src/include/storage/spin.h
index c0679c59992..7ec32cd816a 100644
--- a/src/include/storage/spin.h
+++ b/src/include/storage/spin.h
@@ -61,7 +61,22 @@
 
 #define SpinLockAcquire(lock) S_LOCK(lock)
 
-#define SpinLockRelease(lock) S_UNLOCK(lock)
+static inline void
+SpinLockRelease(volatile slock_t *lock)
+{
+	/*
+	 * Check that this backend actually held the spinlock. Implemented as a
+	 * static inline to avoid multiple-evaluation hazards.
+	 *
+	 * Can't assert when using the semaphore fallback implementation - it
+	 * doesn't implement S_LOCK_FREE() - but the fallback triggers failures in
+	 * the case of double-release on its own.
+	 */
+#ifdef HAVE_SPINLOCKS
+	Assert(!S_LOCK_FREE(lock));
+#endif
+	S_UNLOCK(lock);
+}
 
 #define SpinLockFree(lock)	S_LOCK_FREE(lock)
 
-- 
2.45.2.746.g06e570c0df.dirty

From 574d02d3d4d7f38b764ca6a2e4d2617a417ab8a8 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 29 Jul 2024 10:55:11 -0700
Subject: [PATCH v2 2/2] Detect many uses of uninitialized spinlocks on x86-64

On most platforms we use 0 for an unlocked spinlock and 1 for a locked one. As
we often place spinlocks in zero-initialized memory, missing calls to
SpinLockInit() aren't noticed.

To address that, swap the meaning of the spinlock state and use 1 for unlocked
and 0 for locked. That way the first acquisition of an uninitialized spinlock
blocks. A dedicated state for initialized locks would allow for a nicer
assertion, but ends up with slightly less dense code (a three-byte cmp with an
immediate, instead of a two-byte test).

To make the swapped meaning of the slock_t state easier to understand when
debugging, change slock_t to be a struct on x86-64, with an "is_free"
member.

Author:
Reviewed-by:
Discussion: https://postgr.es/m/20240729164026.yurg37tej34o7...@awork3.anarazel.de
Backpatch:
---
 src/include/storage/s_lock.h | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 02c68513a53..18291b284b0 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -205,7 +205,10 @@ spin_delay(void)
 #ifdef __x86_64__		/* AMD Opteron, Intel EM64T */
 #define HAS_TEST_AND_SET
 
-typedef unsigned char slock_t;
+typedef struct slock_t
+{
+	char	is_free;
+} slock_t;
 
 #define TAS(lock) tas(lock)
 
@@ -217,21 +220,27 @@ typedef unsigned char slock_t;
  * and IA32, by Michael Chynoweth and Mary R. Lee. As of this writing, it is
  * available at:
  * http://software.intel.com/en-us/articles/implementing-scalable-atomic-locks-for-multi-core-intel-em64t-and-ia32-architectures
+ *
+ * To catch uses of uninitialized spinlocks, we define 1 as unlocked and 0 as
+ * locked. That causes the first acquisition of an uninitialized spinlock to
+ * block forever. A dedicated state for initialized locks would allow for a
+ * nicer assertion, but ends up with slightly less dense code (a three-byte
+ * cmp with an immediate, instead of a two-byte test).
  */
-#define TAS_SPIN(lock)    (*(lock) ? 1 : TAS(lock))
+#define TAS_SPIN(lock)    ((lock)->is_free == 0 ? 1 : TAS(lock))
 
 static __inline__ int
 tas(volatile slock_t *lock)
 {
-	slock_t		_res = 1;
+	char		was_free = 0;
 
 	__asm__ __volatile__(
 		"	lock			\n"
 		"	xchgb	%0,%1	\n"
-:		"+q"(_res), "+m"(*lock)
+:		"+q"(was_free), "+m"(lock->is_free)
 :		/* no inputs */
 :		"memory", "cc");
-	return (int) _res;
+	return was_free == 0;
 }
 
 #define SPIN_DELAY() spin_delay()
@@ -247,6 +256,13 @@ spin_delay(void)
 		" rep; nop			\n");
 }
 
+#define S_UNLOCK(lock)	\
+	do { __asm__ __volatile__("" : : : "memory");  (lock)->is_free = 1; } while (0)
+
+#define S_LOCK_FREE(lock)	((lock)->is_free == 1)
+
+#define S_INIT_LOCK(lock)	S_UNLOCK(lock)
+
 #endif	 /* __x86_64__ */
 
 
-- 
2.45.2.746.g06e570c0df.dirty
