On Tue, Jul 2, 2024 at 5:56 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Done at commit edadeb0710.

Here are some experimental patches to try out some ideas mentioned
upthread, that are approximately unlocked by that cleanup.

1.  We could get rid of --disable-spinlocks.  It is hard to imagine a
hypothetical new port that would actually be a useful place to run
PostgreSQL where you can't implement spinlocks.  (This one isn't
exactly unlocked by PA-RISC's departure, it's just tangled up with the
relevant cruft.)

2.  We could get rid of --disable-atomics, and require at least 32 bit
lock-free (non-emulated) atomics.  AFAIK there are no relevant systems
that don't have them.  Hypothetical new systems would be unlikely to
omit them, unless they are eg embedded systems that don't intend to be
able to run an OS.

Personally I would like to do this, because I'd like to be able to use
pg_atomic_fetch_or_u32() in a SIGALRM handler in my
latchify-all-the-things patch (a stepping stone in the multi-threading
project as discussed in the Vancouver unconference).  That's not
allowed if it might be a locking fallback.  It's not strictly
necessary for my project, and I could find another way if I have to,
but when contemplating doing extra work to support imaginary computers
that I don't truly believe in... and since this general direction was
suggested already, both on this thread and in the comments in the
tree...

Once you decide to do #2, ie require atomics, perhaps you could also
implement spinlocks with them, rendering point #1 moot, and delete all
that hand-rolled TAS stuff.  (Then you'd have spinlocks implemented
with flag/u32 atomics APIs, but potentially also u64 atomics
implemented with spinlocks!  Circular, but not impossible AFAICT.
Assuming we can't require 64 bit lock-free atomics any time soon that
is, not considered).  🤯🤯🤯But maybe there are still good reasons to
have hand-rolled specialisations in some cases?  I have not researched
that idea and eg compared the generated instructions... I do
appreciate that that code reflects a lot of accumulated wisdom and
experience that I don't claim to possess, and this bit is vapourware
anyway.

3.  While tinkering with the above, and contemplating contact with
hypothetical future systems and even existing oddball systems, it
practically suggests itself that we could allow <stdatomic.h> as a way
of providing atomics (port/atomics.h clearly anticipated that, it was
just too soon).  Note: that's different from requiring C11, but it
means that the new rule would be that your system should have *either*
C11 <stdatomic.h> or a hand-rolled implementation in port/atomics/*.h.
This is not a proposal, just an early stage experiment to test the
waters!

Some early thoughts about that, not fully explored:
* Since C11 uses funky generics, perhaps we might want to add some
type checks to make sure you don't accidentally confuse u32 and u64
somewhere.
* I couldn't immediately see how to use the standard atomic_flag for
our stuff due to lack of relaxed load, so it's falling back to the
generic u32 implementation (a small waste of space).  atomic_bool or
atomic_char should work fine though, not tried.  I guess
pg_atomic_flag might be a minor historical mistake, assuming it was
supposed to be just like the standard type of the same name.  Or maybe
I'm missing something.
* The pg_spin_delay_impl() part definitely still needs hand-rolled
magic still when using <stdatomic.h> (I'm not aware of any standard
way to do that).  But I'm not sure it even belongs in the "atomics"
headers anyway?  It's not the same kind of thing, is it?
* The comments seem to imply that we need to tell the compiler not to
generate any code for read/write barriers on TSO systems (compiler
barrier only), but AFAICS the right thing happens anyway when coded as
standard acquire/release barriers.  x86: nothing.  ARM: something.
What am I missing?
* It'd be interesting to learn about anything else that modern tool
chains might do worse than our hand-rolled wisdom.
* Special support for Sun's compiler could be dropped if we could just
use their <stdatomic.h>.  The same applies for MSVC 2022+ AFAICS, so
maybe in ~3 years from now we could drop the Windows-specific code.
* Uhh, yeah, so that would also apply to any modern GCC/Clang, so in
effect everyone would be using <stdatomic.h> except any hand-rolled
special bits that we decide to keep for performance reasons, and the
rest would become dead code and liable for garbage collection.  So
that would amount to a confusing policy like: "we require
<stdatomic.h> with at least lock-free int in practice, but we'd
consider patches to add a non-C11-way to do this stuff if you invent a
new kind of computer/toolchain and refuse to support C11".  Hmm.  (I
have another version of this type of thinking happening in another
pending patch, the pg_threads.h one, more on that shortly...)
From fb79e823712b126d1701fe4aa8d8bd5c4def2e75 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 2 Jul 2024 14:47:59 +1200
Subject: [PATCH v1 1/3] Remove --disable-spinlocks.

The --disable-spinlocks build option was provided to allow new systems
unknown to s_lock.h/c to be brought up.  It is not expected to be
difficult to implement the required code for new systems.
---
 configure                               |  40 ------
 configure.ac                            |  13 --
 doc/src/sgml/installation.sgml          |  37 +----
 meson.build                             |   6 -
 src/backend/port/atomics.c              |  26 ----
 src/backend/postmaster/launch_backend.c |   9 --
 src/backend/storage/ipc/ipci.c          |  10 --
 src/backend/storage/lmgr/Makefile       |   1 -
 src/backend/storage/lmgr/meson.build    |   1 -
 src/backend/storage/lmgr/s_lock.c       |   2 +-
 src/backend/storage/lmgr/spin.c         | 180 ------------------------
 src/include/pg_config.h.in              |   3 -
 src/include/pg_config_manual.h          |  15 --
 src/include/port/atomics.h              |   4 +-
 src/include/storage/s_lock.h            |  39 +----
 src/include/storage/spin.h              |  18 +--
 src/test/regress/regress.c              |  86 -----------
 17 files changed, 10 insertions(+), 480 deletions(-)
 delete mode 100644 src/backend/storage/lmgr/spin.c

diff --git a/configure b/configure
index 76f06bd8fda..3a577f96e8f 100755
--- a/configure
+++ b/configure
@@ -836,7 +836,6 @@ enable_integer_datetimes
 enable_nls
 with_pgport
 enable_rpath
-enable_spinlocks
 enable_atomics
 enable_debug
 enable_profiling
@@ -1529,7 +1528,6 @@ Optional Features:
                           enable Native Language Support
   --disable-rpath         do not embed shared library search path in
                           executables
-  --disable-spinlocks     do not use spinlocks
   --disable-atomics       do not use atomic operations
   --enable-debug          build with debugging symbols (-g)
   --enable-profiling      build with profiling enabled
@@ -3266,33 +3264,6 @@ fi
 
 
 
-#
-# Spinlocks
-#
-
-
-# Check whether --enable-spinlocks was given.
-if test "${enable_spinlocks+set}" = set; then :
-  enableval=$enable_spinlocks;
-  case $enableval in
-    yes)
-      :
-      ;;
-    no)
-      :
-      ;;
-    *)
-      as_fn_error $? "no argument expected for --enable-spinlocks option" "$LINENO" 5
-      ;;
-  esac
-
-else
-  enable_spinlocks=yes
-
-fi
-
-
-
 #
 # Atomic operations
 #
@@ -12185,17 +12156,6 @@ fi
 
 fi
 
-if test "$enable_spinlocks" = yes; then
-
-$as_echo "#define HAVE_SPINLOCKS 1" >>confdefs.h
-
-else
-  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
-*** Not using spinlocks will cause poor performance." >&5
-$as_echo "$as_me: WARNING:
-*** Not using spinlocks will cause poor performance." >&2;}
-fi
-
 if test "$enable_atomics" = yes; then
 
 $as_echo "#define HAVE_ATOMICS 1" >>confdefs.h
diff --git a/configure.ac b/configure.ac
index ab2d51c21ce..2dd8c1613fb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -186,12 +186,6 @@ PGAC_ARG_BOOL(enable, rpath, yes,
               [do not embed shared library search path in executables])
 AC_SUBST(enable_rpath)
 
-#
-# Spinlocks
-#
-PGAC_ARG_BOOL(enable, spinlocks, yes,
-              [do not use spinlocks])
-
 #
 # Atomic operations
 #
@@ -1296,13 +1290,6 @@ failure.  It is possible the compiler isn't looking in the proper directory.
 Use --without-zlib to disable zlib support.])])
 fi
 
-if test "$enable_spinlocks" = yes; then
-  AC_DEFINE(HAVE_SPINLOCKS, 1, [Define to 1 if you have spinlocks.])
-else
-  AC_MSG_WARN([
-*** Not using spinlocks will cause poor performance.])
-fi
-
 if test "$enable_atomics" = yes; then
   AC_DEFINE(HAVE_ATOMICS, 1, [Define to 1 if you want to use atomics if available.])
 else
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 4784834ab9f..3f19f272b17 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1258,22 +1258,6 @@ build-postgresql:
        </listitem>
       </varlistentry>
 
-      <varlistentry id="configure-option-disable-spinlocks">
-       <term><option>--disable-spinlocks</option></term>
-       <listitem>
-        <para>
-         Allow the build to succeed even if <productname>PostgreSQL</productname>
-         has no CPU spinlock support for the platform.  The lack of
-         spinlock support will result in very poor performance; therefore,
-         this option should only be used if the build aborts and
-         informs you that the platform lacks spinlock support. If this
-         option is required to build <productname>PostgreSQL</productname> on
-         your platform, please report the problem to the
-         <productname>PostgreSQL</productname> developers.
-        </para>
-       </listitem>
-      </varlistentry>
-
       <varlistentry id="configure-option-disable-atomics">
        <term><option>--disable-atomics</option></term>
        <listitem>
@@ -2690,23 +2674,6 @@ ninja install
       </listitem>
      </varlistentry>
 
-     <varlistentry id="configure-spinlocks-meson">
-      <term><option>-Dspinlocks={ true | false }</option></term>
-      <listitem>
-       <para>
-        This option is set to true by default; setting it to false will
-        allow the build to succeed even if <productname>PostgreSQL</productname>
-        has no CPU spinlock support for the platform.  The lack of
-        spinlock support will result in very poor performance; therefore,
-        this option should only be changed if the build aborts and
-        informs you that the platform lacks spinlock support. If setting this
-        option to false is required to build <productname>PostgreSQL</productname> on
-        your platform, please report the problem to the
-        <productname>PostgreSQL</productname> developers.
-       </para>
-      </listitem>
-     </varlistentry>
-
      <varlistentry id="configure-atomics-meson">
       <term><option>-Datomics={ true | false }</option></term>
       <listitem>
@@ -2719,6 +2686,7 @@ ninja install
        </para>
       </listitem>
      </varlistentry>
+
     </variablelist>
    </sect3>
 
@@ -3393,9 +3361,6 @@ export MANPATH
    these CPU architectures: x86, PowerPC, S/390, SPARC, ARM, MIPS,
    and RISC-V, including
    big-endian, little-endian, 32-bit, and 64-bit variants where applicable.
-   It is often
-   possible to build on an unsupported CPU type by configuring with
-   <option>--disable-spinlocks</option>, but performance will be poor.
   </para>
 
   <para>
diff --git a/meson.build b/meson.build
index 5387bb6d5fd..0d569e7a240 100644
--- a/meson.build
+++ b/meson.build
@@ -1983,12 +1983,6 @@ endif
 # Atomics
 ###############################################################
 
-if not get_option('spinlocks')
-  warning('Not using spinlocks will cause poor performance')
-else
-  cdata.set('HAVE_SPINLOCKS', 1)
-endif
-
 if not get_option('atomics')
   warning('Not using atomics will cause poor performance')
 else
diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
index 93789b4e058..cd7ede96726 100644
--- a/src/backend/port/atomics.c
+++ b/src/backend/port/atomics.c
@@ -57,17 +57,7 @@ pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr)
 	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
 					 "size mismatch of atomic_flag vs slock_t");
 
-#ifndef HAVE_SPINLOCKS
-
-	/*
-	 * NB: If we're using semaphore based TAS emulation, be careful to use a
-	 * separate set of semaphores. Otherwise we'd get in trouble if an atomic
-	 * var would be manipulated while spinlock is held.
-	 */
-	s_init_lock_sema((slock_t *) &ptr->sema, true);
-#else
 	SpinLockInit((slock_t *) &ptr->sema);
-#endif
 
 	ptr->value = false;
 }
@@ -108,15 +98,7 @@ pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_)
 	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
 					 "size mismatch of atomic_uint32 vs slock_t");
 
-	/*
-	 * If we're using semaphore based atomic flags, be careful about nested
-	 * usage of atomics while a spinlock is held.
-	 */
-#ifndef HAVE_SPINLOCKS
-	s_init_lock_sema((slock_t *) &ptr->sema, true);
-#else
 	SpinLockInit((slock_t *) &ptr->sema);
-#endif
 	ptr->value = val_;
 }
 
@@ -184,15 +166,7 @@ pg_atomic_init_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val_)
 	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
 					 "size mismatch of atomic_uint64 vs slock_t");
 
-	/*
-	 * If we're using semaphore based atomic flags, be careful about nested
-	 * usage of atomics while a spinlock is held.
-	 */
-#ifndef HAVE_SPINLOCKS
-	s_init_lock_sema((slock_t *) &ptr->sema, true);
-#else
 	SpinLockInit((slock_t *) &ptr->sema);
-#endif
 	ptr->value = val_;
 }
 
diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index bdfa238e4fe..04221a55e01 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -104,9 +104,6 @@ typedef struct
 	void	   *UsedShmemSegAddr;
 	slock_t    *ShmemLock;
 	struct bkend *ShmemBackendArray;
-#ifndef HAVE_SPINLOCKS
-	PGSemaphore *SpinlockSemaArray;
-#endif
 	int			NamedLWLockTrancheRequests;
 	NamedLWLockTranche *NamedLWLockTrancheArray;
 	LWLockPadded *MainLWLockArray;
@@ -722,9 +719,6 @@ save_backend_variables(BackendParameters *param, ClientSocket *client_sock,
 	param->ShmemLock = ShmemLock;
 	param->ShmemBackendArray = ShmemBackendArray;
 
-#ifndef HAVE_SPINLOCKS
-	param->SpinlockSemaArray = SpinlockSemaArray;
-#endif
 	param->NamedLWLockTrancheRequests = NamedLWLockTrancheRequests;
 	param->NamedLWLockTrancheArray = NamedLWLockTrancheArray;
 	param->MainLWLockArray = MainLWLockArray;
@@ -980,9 +974,6 @@ restore_backend_variables(BackendParameters *param)
 	ShmemLock = param->ShmemLock;
 	ShmemBackendArray = param->ShmemBackendArray;
 
-#ifndef HAVE_SPINLOCKS
-	SpinlockSemaArray = param->SpinlockSemaArray;
-#endif
 	NamedLWLockTrancheRequests = param->NamedLWLockTrancheRequests;
 	NamedLWLockTrancheArray = param->NamedLWLockTrancheArray;
 	MainLWLockArray = param->MainLWLockArray;
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 2100150f01c..921a93588f6 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -94,7 +94,6 @@ CalculateShmemSize(int *num_semaphores)
 
 	/* Compute number of semaphores we'll need */
 	numSemas = ProcGlobalSemas();
-	numSemas += SpinlockSemas();
 
 	/* Return the number of semaphores if requested by the caller */
 	if (num_semaphores)
@@ -111,7 +110,6 @@ CalculateShmemSize(int *num_semaphores)
 	 */
 	size = 100000;
 	size = add_size(size, PGSemaphoreShmemSize(numSemas));
-	size = add_size(size, SpinlockSemaSize());
 	size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
 											 sizeof(ShmemIndexEnt)));
 	size = add_size(size, dsm_estimate_size());
@@ -228,14 +226,6 @@ CreateSharedMemoryAndSemaphores(void)
 	 */
 	PGReserveSemaphores(numSemas);
 
-	/*
-	 * If spinlocks are disabled, initialize emulation layer (which depends on
-	 * semaphores, so the order is important here).
-	 */
-#ifndef HAVE_SPINLOCKS
-	SpinlockSemaInit();
-#endif
-
 	/*
 	 * Set up shared memory allocation mechanism
 	 */
diff --git a/src/backend/storage/lmgr/Makefile b/src/backend/storage/lmgr/Makefile
index 3f89548bde6..6cbaf23b855 100644
--- a/src/backend/storage/lmgr/Makefile
+++ b/src/backend/storage/lmgr/Makefile
@@ -21,7 +21,6 @@ OBJS = \
 	predicate.o \
 	proc.o \
 	s_lock.o \
-	spin.o
 
 include $(top_srcdir)/src/backend/common.mk
 
diff --git a/src/backend/storage/lmgr/meson.build b/src/backend/storage/lmgr/meson.build
index 05ac41e809a..d43511925e1 100644
--- a/src/backend/storage/lmgr/meson.build
+++ b/src/backend/storage/lmgr/meson.build
@@ -9,5 +9,4 @@ backend_sources += files(
   'predicate.c',
   'proc.c',
   's_lock.c',
-  'spin.c',
 )
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index cba48b3e778..69549a65dba 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * s_lock.c
- *	   Hardware-dependent implementation of spinlocks.
+ *	   Implementation of spinlocks.
  *
  * When waiting for a contended spinlock we loop tightly for awhile, then
  * delay using pg_usleep() and try again.  Preferably, "awhile" should be a
diff --git a/src/backend/storage/lmgr/spin.c b/src/backend/storage/lmgr/spin.c
deleted file mode 100644
index 50cb99cd3b6..00000000000
--- a/src/backend/storage/lmgr/spin.c
+++ /dev/null
@@ -1,180 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * spin.c
- *	   Hardware-independent implementation of spinlocks.
- *
- *
- * For machines that have test-and-set (TAS) instructions, s_lock.h/.c
- * define the spinlock implementation.  This file contains only a stub
- * implementation for spinlocks using PGSemaphores.  Unless semaphores
- * are implemented in a way that doesn't involve a kernel call, this
- * is too slow to be very useful :-(
- *
- *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- *
- * IDENTIFICATION
- *	  src/backend/storage/lmgr/spin.c
- *
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "storage/pg_sema.h"
-#include "storage/shmem.h"
-#include "storage/spin.h"
-
-
-#ifndef HAVE_SPINLOCKS
-
-/*
- * No TAS, so spinlocks are implemented as PGSemaphores.
- */
-
-#ifndef HAVE_ATOMICS
-#define NUM_EMULATION_SEMAPHORES (NUM_SPINLOCK_SEMAPHORES + NUM_ATOMICS_SEMAPHORES)
-#else
-#define NUM_EMULATION_SEMAPHORES (NUM_SPINLOCK_SEMAPHORES)
-#endif							/* HAVE_ATOMICS */
-
-PGSemaphore *SpinlockSemaArray;
-
-#else							/* !HAVE_SPINLOCKS */
-
-#define NUM_EMULATION_SEMAPHORES 0
-
-#endif							/* HAVE_SPINLOCKS */
-
-/*
- * Report the amount of shared memory needed to store semaphores for spinlock
- * support.
- */
-Size
-SpinlockSemaSize(void)
-{
-	return NUM_EMULATION_SEMAPHORES * sizeof(PGSemaphore);
-}
-
-/*
- * Report number of semaphores needed to support spinlocks.
- */
-int
-SpinlockSemas(void)
-{
-	return NUM_EMULATION_SEMAPHORES;
-}
-
-#ifndef HAVE_SPINLOCKS
-
-/*
- * Initialize spinlock emulation.
- *
- * This must be called after PGReserveSemaphores().
- */
-void
-SpinlockSemaInit(void)
-{
-	PGSemaphore *spinsemas;
-	int			nsemas = SpinlockSemas();
-	int			i;
-
-	/*
-	 * We must use ShmemAllocUnlocked(), since the spinlock protecting
-	 * ShmemAlloc() obviously can't be ready yet.
-	 */
-	spinsemas = (PGSemaphore *) ShmemAllocUnlocked(SpinlockSemaSize());
-	for (i = 0; i < nsemas; ++i)
-		spinsemas[i] = PGSemaphoreCreate();
-	SpinlockSemaArray = spinsemas;
-}
-
-/*
- * s_lock.h hardware-spinlock emulation using semaphores
- *
- * We map all spinlocks onto NUM_EMULATION_SEMAPHORES semaphores.  It's okay to
- * map multiple spinlocks onto one semaphore because no process should ever
- * hold more than one at a time.  We just need enough semaphores so that we
- * aren't adding too much extra contention from that.
- *
- * There is one exception to the restriction of only holding one spinlock at a
- * time, which is that it's ok if emulated atomic operations are nested inside
- * spinlocks. To avoid the danger of spinlocks and atomic using the same sema,
- * we make sure "normal" spinlocks and atomics backed by spinlocks use
- * distinct semaphores (see the nested argument to s_init_lock_sema).
- *
- * slock_t is just an int for this implementation; it holds the spinlock
- * number from 1..NUM_EMULATION_SEMAPHORES.  We intentionally ensure that 0
- * is not a valid value, so that testing with this code can help find
- * failures to initialize spinlocks.
- */
-
-static inline void
-s_check_valid(int lockndx)
-{
-	if (unlikely(lockndx <= 0 || lockndx > NUM_EMULATION_SEMAPHORES))
-		elog(ERROR, "invalid spinlock number: %d", lockndx);
-}
-
-void
-s_init_lock_sema(volatile slock_t *lock, bool nested)
-{
-	static uint32 counter = 0;
-	uint32		offset;
-	uint32		sema_total;
-	uint32		idx;
-
-	if (nested)
-	{
-		/*
-		 * To allow nesting atomics inside spinlocked sections, use a
-		 * different spinlock. See comment above.
-		 */
-		offset = 1 + NUM_SPINLOCK_SEMAPHORES;
-		sema_total = NUM_ATOMICS_SEMAPHORES;
-	}
-	else
-	{
-		offset = 1;
-		sema_total = NUM_SPINLOCK_SEMAPHORES;
-	}
-
-	idx = (counter++ % sema_total) + offset;
-
-	/* double check we did things correctly */
-	s_check_valid(idx);
-
-	*lock = idx;
-}
-
-void
-s_unlock_sema(volatile slock_t *lock)
-{
-	int			lockndx = *lock;
-
-	s_check_valid(lockndx);
-
-	PGSemaphoreUnlock(SpinlockSemaArray[lockndx - 1]);
-}
-
-bool
-s_lock_free_sema(volatile slock_t *lock)
-{
-	/* We don't currently use S_LOCK_FREE anyway */
-	elog(ERROR, "spin.c does not support S_LOCK_FREE()");
-	return false;
-}
-
-int
-tas_sema(volatile slock_t *lock)
-{
-	int			lockndx = *lock;
-
-	s_check_valid(lockndx);
-
-	/* Note that TAS macros return 0 if *success* */
-	return !PGSemaphoreTryLock(SpinlockSemaArray[lockndx - 1]);
-}
-
-#endif							/* !HAVE_SPINLOCKS */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index f8d3e3b6b84..ed616278b14 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -378,9 +378,6 @@
 /* Define to 1 if the system has the type `socklen_t'. */
 #undef HAVE_SOCKLEN_T
 
-/* Define to 1 if you have spinlocks. */
-#undef HAVE_SPINLOCKS
-
 /* Define to 1 if you have the `SSL_CTX_set_cert_cb' function. */
 #undef HAVE_SSL_CTX_SET_CERT_CB
 
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index f941ee2faf8..11f74f4b56d 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -86,21 +86,6 @@
 #define USE_FLOAT8_BYVAL 1
 #endif
 
-/*
- * When we don't have native spinlocks, we use semaphores to simulate them.
- * Decreasing this value reduces consumption of OS resources; increasing it
- * may improve performance, but supplying a real spinlock implementation is
- * probably far better.
- */
-#define NUM_SPINLOCK_SEMAPHORES		128
-
-/*
- * When we have neither spinlocks nor atomic operations support we're
- * implementing atomic operations on top of spinlock on top of semaphores. To
- * be safe against atomic operations while holding a spinlock separate
- * semaphores have to be used.
- */
-#define NUM_ATOMICS_SEMAPHORES		64
 
 /*
  * MAXPGPATH: standard size of a pathname buffer in PostgreSQL (hence,
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index c911c6b9564..f92a9f62ba4 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -16,8 +16,8 @@
  *
  * There exist generic, hardware independent, implementations for several
  * compilers which might be sufficient, although possibly not optimal, for a
- * new platform. If no such generic implementation is available spinlocks (or
- * even OS provided semaphores) will be used to implement the API.
+ * new platform. If no such generic implementation is available spinlocks will
+ * be used to implement the API.
  *
  * Implement _u64 atomics if and only if your platform can use them
  * efficiently (and obviously correctly).
diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 02c68513a53..e94ed5f48bd 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -1,10 +1,10 @@
 /*-------------------------------------------------------------------------
  *
  * s_lock.h
- *	   Hardware-dependent implementation of spinlocks.
+ *	   Implementation of spinlocks.
  *
  *	NOTE: none of the macros in this file are intended to be called directly.
- *	Call them through the hardware-independent macros in spin.h.
+ *	Call them through the macros in spin.h.
  *
  *	The following hardware-dependent macros must be provided for each
  *	supported platform:
@@ -78,13 +78,6 @@
  *	in assembly language to execute a hardware atomic-test-and-set
  *	instruction.  Equivalent OS-supplied mutex routines could be used too.
  *
- *	If no system-specific TAS() is available (ie, HAVE_SPINLOCKS is not
- *	defined), then we fall back on an emulation that uses SysV semaphores
- *	(see spin.c).  This emulation will be MUCH MUCH slower than a proper TAS()
- *	implementation, because of the cost of a kernel call per lock or unlock.
- *	An old report is that Postgres spends around 40% of its time in semop(2)
- *	when using the SysV semaphore code.
- *
  *
  * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -100,8 +93,6 @@
 #error "s_lock.h may not be included from frontend code"
 #endif
 
-#ifdef HAVE_SPINLOCKS	/* skip spinlocks if requested */
-
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
 /*************************************************************************
  * All the gcc inlines
@@ -655,34 +646,10 @@ spin_delay(void)
 
 /* Blow up if we didn't have any way to do spinlocks */
 #ifndef HAS_TEST_AND_SET
-#error PostgreSQL does not have native spinlock support on this platform.  To continue the compilation, rerun configure using --disable-spinlocks.  However, performance will be poor.  Please report this to pgsql-b...@lists.postgresql.org.
+#error PostgreSQL does not have spinlock support on this platform.  Please report this to pgsql-b...@lists.postgresql.org.
 #endif
 
 
-#else	/* !HAVE_SPINLOCKS */
-
-
-/*
- * Fake spinlock implementation using semaphores --- slow and prone
- * to fall foul of kernel limits on number of semaphores, so don't use this
- * unless you must!  The subroutines appear in spin.c.
- */
-typedef int slock_t;
-
-extern bool s_lock_free_sema(volatile slock_t *lock);
-extern void s_unlock_sema(volatile slock_t *lock);
-extern void s_init_lock_sema(volatile slock_t *lock, bool nested);
-extern int	tas_sema(volatile slock_t *lock);
-
-#define S_LOCK_FREE(lock)	s_lock_free_sema(lock)
-#define S_UNLOCK(lock)	 s_unlock_sema(lock)
-#define S_INIT_LOCK(lock)	s_init_lock_sema(lock, false)
-#define TAS(lock)	tas_sema(lock)
-
-
-#endif	/* HAVE_SPINLOCKS */
-
-
 /*
  * Default Definitions - override these above as needed.
  */
diff --git a/src/include/storage/spin.h b/src/include/storage/spin.h
index c0679c59992..3ae2a56d073 100644
--- a/src/include/storage/spin.h
+++ b/src/include/storage/spin.h
@@ -1,11 +1,11 @@
 /*-------------------------------------------------------------------------
  *
  * spin.h
- *	   Hardware-independent implementation of spinlocks.
+ *	   API for spinlocks.
  *
  *
- *	The hardware-independent interface to spinlocks is defined by the
- *	typedef "slock_t" and these macros:
+ *	The interface to spinlocks is defined by the typedef "slock_t" and
+ *	these macros:
  *
  *	void SpinLockInit(volatile slock_t *lock)
  *		Initialize a spinlock (to the unlocked state).
@@ -52,9 +52,6 @@
 #define SPIN_H
 
 #include "storage/s_lock.h"
-#ifndef HAVE_SPINLOCKS
-#include "storage/pg_sema.h"
-#endif
 
 
 #define SpinLockInit(lock)	S_INIT_LOCK(lock)
@@ -65,13 +62,4 @@
 
 #define SpinLockFree(lock)	S_LOCK_FREE(lock)
 
-
-extern int	SpinlockSemas(void);
-extern Size SpinlockSemaSize(void);
-
-#ifndef HAVE_SPINLOCKS
-extern void SpinlockSemaInit(void);
-extern PGDLLIMPORT PGSemaphore *SpinlockSemaArray;
-#endif
-
 #endif							/* SPIN_H */
diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c
index 45a6ad3c49e..14aad5a0c6e 100644
--- a/src/test/regress/regress.c
+++ b/src/test/regress/regress.c
@@ -887,91 +887,7 @@ test_spinlock(void)
 		if (memcmp(struct_w_lock.data_after, "ef12", 4) != 0)
 			elog(ERROR, "padding after spinlock modified");
 	}
-
-	/*
-	 * Ensure that allocating more than INT32_MAX emulated spinlocks works.
-	 * That's interesting because the spinlock emulation uses a 32bit integer
-	 * to map spinlocks onto semaphores. There've been bugs...
-	 */
-#ifndef HAVE_SPINLOCKS
-	{
-		/*
-		 * Initialize enough spinlocks to advance counter close to wraparound.
-		 * It's too expensive to perform acquire/release for each, as those
-		 * may be syscalls when the spinlock emulation is used (and even just
-		 * atomic TAS would be expensive).
-		 */
-		for (uint32 i = 0; i < INT32_MAX - 100000; i++)
-		{
-			slock_t		lock;
-
-			SpinLockInit(&lock);
-		}
-
-		for (uint32 i = 0; i < 200000; i++)
-		{
-			slock_t		lock;
-
-			SpinLockInit(&lock);
-
-			SpinLockAcquire(&lock);
-			SpinLockRelease(&lock);
-			SpinLockAcquire(&lock);
-			SpinLockRelease(&lock);
-		}
-	}
-#endif
-}
-
-/*
- * Verify that performing atomic ops inside a spinlock isn't a
- * problem. Realistically that's only going to be a problem when both
- * --disable-spinlocks and --disable-atomics are used, but it's cheap enough
- * to just always test.
- *
- * The test works by initializing enough atomics that we'd conflict if there
- * were an overlap between a spinlock and an atomic by holding a spinlock
- * while manipulating more than NUM_SPINLOCK_SEMAPHORES atomics.
- *
- * NUM_TEST_ATOMICS doesn't really need to be more than
- * NUM_SPINLOCK_SEMAPHORES, but it seems better to test a bit more
- * extensively.
- */
-static void
-test_atomic_spin_nest(void)
-{
-	slock_t		lock;
-#define NUM_TEST_ATOMICS (NUM_SPINLOCK_SEMAPHORES + NUM_ATOMICS_SEMAPHORES + 27)
-	pg_atomic_uint32 atomics32[NUM_TEST_ATOMICS];
-	pg_atomic_uint64 atomics64[NUM_TEST_ATOMICS];
-
-	SpinLockInit(&lock);
-
-	for (int i = 0; i < NUM_TEST_ATOMICS; i++)
-	{
-		pg_atomic_init_u32(&atomics32[i], 0);
-		pg_atomic_init_u64(&atomics64[i], 0);
-	}
-
-	/* just so it's not all zeroes */
-	for (int i = 0; i < NUM_TEST_ATOMICS; i++)
-	{
-		EXPECT_EQ_U32(pg_atomic_fetch_add_u32(&atomics32[i], i), 0);
-		EXPECT_EQ_U64(pg_atomic_fetch_add_u64(&atomics64[i], i), 0);
-	}
-
-	/* test whether we can do atomic op with lock held */
-	SpinLockAcquire(&lock);
-	for (int i = 0; i < NUM_TEST_ATOMICS; i++)
-	{
-		EXPECT_EQ_U32(pg_atomic_fetch_sub_u32(&atomics32[i], i), i);
-		EXPECT_EQ_U32(pg_atomic_read_u32(&atomics32[i]), 0);
-		EXPECT_EQ_U64(pg_atomic_fetch_sub_u64(&atomics64[i], i), i);
-		EXPECT_EQ_U64(pg_atomic_read_u64(&atomics64[i]), 0);
-	}
-	SpinLockRelease(&lock);
 }
-#undef NUM_TEST_ATOMICS
 
 PG_FUNCTION_INFO_V1(test_atomic_ops);
 Datum
@@ -989,8 +905,6 @@ test_atomic_ops(PG_FUNCTION_ARGS)
 	 */
 	test_spinlock();
 
-	test_atomic_spin_nest();
-
 	PG_RETURN_BOOL(true);
 }
 
-- 
2.45.2

From 42f94a323aefffb86bbe02cbb15841d0a93b0abd Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 2 Jul 2024 15:13:00 +1200
Subject: [PATCH v1 2/3] Remove --disable-atomics, require 32 bit atomics.

All target systems have real atomics support.  We're not yet ready to
use C11 <stdatomic.h>, but since edadeb07 there is no remaining reason
to carry code that simulates atomic flag and uint32 imperfectly with
spinlocks.
---
 configure                                 |  40 --------
 configure.ac                              |  13 ---
 doc/src/sgml/installation.sgml            |  25 -----
 meson.build                               |  65 ++++++-------
 src/backend/port/atomics.c                | 109 ----------------------
 src/include/pg_config.h.in                |   3 -
 src/include/port/atomics.h                |   7 +-
 src/include/port/atomics/arch-x86.h       |   8 --
 src/include/port/atomics/fallback.h       |  87 +----------------
 src/include/port/atomics/generic-gcc.h    |   4 -
 src/include/port/atomics/generic-msvc.h   |   4 -
 src/include/port/atomics/generic-sunpro.h |   8 --
 12 files changed, 32 insertions(+), 341 deletions(-)

diff --git a/configure b/configure
index 3a577f96e8f..4c73e46e246 100755
--- a/configure
+++ b/configure
@@ -836,7 +836,6 @@ enable_integer_datetimes
 enable_nls
 with_pgport
 enable_rpath
-enable_atomics
 enable_debug
 enable_profiling
 enable_coverage
@@ -1528,7 +1527,6 @@ Optional Features:
                           enable Native Language Support
   --disable-rpath         do not embed shared library search path in
                           executables
-  --disable-atomics       do not use atomic operations
   --enable-debug          build with debugging symbols (-g)
   --enable-profiling      build with profiling enabled
   --enable-coverage       build with coverage testing instrumentation
@@ -3264,33 +3262,6 @@ fi
 
 
 
-#
-# Atomic operations
-#
-
-
-# Check whether --enable-atomics was given.
-if test "${enable_atomics+set}" = set; then :
-  enableval=$enable_atomics;
-  case $enableval in
-    yes)
-      :
-      ;;
-    no)
-      :
-      ;;
-    *)
-      as_fn_error $? "no argument expected for --enable-atomics option" "$LINENO" 5
-      ;;
-  esac
-
-else
-  enable_atomics=yes
-
-fi
-
-
-
 #
 # --enable-debug adds -g to compiler flags
 #
@@ -12156,17 +12127,6 @@ fi
 
 fi
 
-if test "$enable_atomics" = yes; then
-
-$as_echo "#define HAVE_ATOMICS 1" >>confdefs.h
-
-else
-  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
-*** Not using atomic operations will cause poor performance." >&5
-$as_echo "$as_me: WARNING:
-*** Not using atomic operations will cause poor performance." >&2;}
-fi
-
 if test "$with_gssapi" = yes ; then
   if test "$PORTNAME" != "win32"; then
     { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gss_store_cred_into" >&5
diff --git a/configure.ac b/configure.ac
index 2dd8c1613fb..8ee3d2cabb9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -186,12 +186,6 @@ PGAC_ARG_BOOL(enable, rpath, yes,
               [do not embed shared library search path in executables])
 AC_SUBST(enable_rpath)
 
-#
-# Atomic operations
-#
-PGAC_ARG_BOOL(enable, atomics, yes,
-              [do not use atomic operations])
-
 #
 # --enable-debug adds -g to compiler flags
 #
@@ -1290,13 +1284,6 @@ failure.  It is possible the compiler isn't looking in the proper directory.
 Use --without-zlib to disable zlib support.])])
 fi
 
-if test "$enable_atomics" = yes; then
-  AC_DEFINE(HAVE_ATOMICS, 1, [Define to 1 if you want to use atomics if available.])
-else
-  AC_MSG_WARN([
-*** Not using atomic operations will cause poor performance.])
-fi
-
 if test "$with_gssapi" = yes ; then
   if test "$PORTNAME" != "win32"; then
     AC_SEARCH_LIBS(gss_store_cred_into, [gssapi_krb5 gss 'gssapi -lkrb5 -lcrypto'], [],
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 3f19f272b17..4ab8ddba7c1 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1258,18 +1258,6 @@ build-postgresql:
        </listitem>
       </varlistentry>
 
-      <varlistentry id="configure-option-disable-atomics">
-       <term><option>--disable-atomics</option></term>
-       <listitem>
-        <para>
-         Disable use of CPU atomic operations.  This option does nothing on
-         platforms that lack such operations.  On platforms that do have
-         them, this will result in poor performance.  This option is only
-         useful for debugging or making performance comparisons.
-        </para>
-       </listitem>
-      </varlistentry>
-
      </variablelist>
 
    </sect3>
@@ -2674,19 +2662,6 @@ ninja install
       </listitem>
      </varlistentry>
 
-     <varlistentry id="configure-atomics-meson">
-      <term><option>-Datomics={ true | false }</option></term>
-      <listitem>
-       <para>
-        This option is set to true by default; setting it to false will
-        disable use of CPU atomic operations.  The option does nothing on
-        platforms that lack such operations.  On platforms that do have
-        them, disabling atomics will result in poor performance.  Changing
-        this option is only useful for debugging or making performance comparisons.
-       </para>
-      </listitem>
-     </varlistentry>
-
     </variablelist>
    </sect3>
 
diff --git a/meson.build b/meson.build
index 0d569e7a240..a382119528c 100644
--- a/meson.build
+++ b/meson.build
@@ -1983,70 +1983,61 @@ endif
 # Atomics
 ###############################################################
 
-if not get_option('atomics')
-  warning('Not using atomics will cause poor performance')
-else
-  # XXX: perhaps we should require some atomics support in this case these
-  # days?
-  cdata.set('HAVE_ATOMICS', 1)
-
-  atomic_checks = [
-    {'name': 'HAVE_GCC__SYNC_CHAR_TAS',
-     'desc': '__sync_lock_test_and_set(char)',
-     'test': '''
+atomic_checks = [
+  {'name': 'HAVE_GCC__SYNC_CHAR_TAS',
+   'desc': '__sync_lock_test_and_set(char)',
+   'test': '''
 char lock = 0;
 __sync_lock_test_and_set(&lock, 1);
 __sync_lock_release(&lock);'''},
 
-    {'name': 'HAVE_GCC__SYNC_INT32_TAS',
-     'desc': '__sync_lock_test_and_set(int32)',
-     'test': '''
+  {'name': 'HAVE_GCC__SYNC_INT32_TAS',
+   'desc': '__sync_lock_test_and_set(int32)',
+   'test': '''
 int lock = 0;
 __sync_lock_test_and_set(&lock, 1);
 __sync_lock_release(&lock);'''},
 
-    {'name': 'HAVE_GCC__SYNC_INT32_CAS',
-     'desc': '__sync_val_compare_and_swap(int32)',
-     'test': '''
+  {'name': 'HAVE_GCC__SYNC_INT32_CAS',
+   'desc': '__sync_val_compare_and_swap(int32)',
+   'test': '''
 int val = 0;
 __sync_val_compare_and_swap(&val, 0, 37);'''},
 
-    {'name': 'HAVE_GCC__SYNC_INT64_CAS',
-     'desc': '__sync_val_compare_and_swap(int64)',
-     'test': '''
+  {'name': 'HAVE_GCC__SYNC_INT64_CAS',
+   'desc': '__sync_val_compare_and_swap(int64)',
+   'test': '''
 INT64 val = 0;
 __sync_val_compare_and_swap(&val, 0, 37);'''},
 
-    {'name': 'HAVE_GCC__ATOMIC_INT32_CAS',
-     'desc': ' __atomic_compare_exchange_n(int32)',
-     'test': '''
+  {'name': 'HAVE_GCC__ATOMIC_INT32_CAS',
+   'desc': ' __atomic_compare_exchange_n(int32)',
+   'test': '''
 int val = 0;
 int expect = 0;
 __atomic_compare_exchange_n(&val, &expect, 37, 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);'''},
 
-    {'name': 'HAVE_GCC__ATOMIC_INT64_CAS',
-     'desc': ' __atomic_compare_exchange_n(int64)',
-     'test': '''
+  {'name': 'HAVE_GCC__ATOMIC_INT64_CAS',
+   'desc': ' __atomic_compare_exchange_n(int64)',
+   'test': '''
 INT64 val = 0;
 INT64 expect = 0;
 __atomic_compare_exchange_n(&val, &expect, 37, 0, __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);'''},
-  ]
+]
 
-  foreach check : atomic_checks
-    test = '''
+foreach check : atomic_checks
+  test = '''
 int main(void)
 {
 @0@
 }'''.format(check['test'])
 
-    cdata.set(check['name'],
-      cc.links(test,
-        name: check['desc'],
-        args: test_c_args + ['-DINT64=@0@'.format(cdata.get('PG_INT64_TYPE'))]) ? 1 : false
-    )
-  endforeach
-
-endif
+  cdata.set(check['name'],
+    cc.links(test,
+      name: check['desc'],
+      args: test_c_args + ['-DINT64=@0@'.format(cdata.get('PG_INT64_TYPE'))]) ? 1 : false
+  )
+endforeach
 
 
 ###############################################################
diff --git a/src/backend/port/atomics.c b/src/backend/port/atomics.c
index cd7ede96726..6f1e014d0b8 100644
--- a/src/backend/port/atomics.c
+++ b/src/backend/port/atomics.c
@@ -49,115 +49,6 @@ pg_extern_compiler_barrier(void)
 #endif
 
 
-#ifdef PG_HAVE_ATOMIC_FLAG_SIMULATION
-
-void
-pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
-					 "size mismatch of atomic_flag vs slock_t");
-
-	SpinLockInit((slock_t *) &ptr->sema);
-
-	ptr->value = false;
-}
-
-bool
-pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	uint32		oldval;
-
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	oldval = ptr->value;
-	ptr->value = true;
-	SpinLockRelease((slock_t *) &ptr->sema);
-
-	return oldval == 0;
-}
-
-void
-pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	ptr->value = false;
-	SpinLockRelease((slock_t *) &ptr->sema);
-}
-
-bool
-pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
-{
-	return ptr->value == 0;
-}
-
-#endif							/* PG_HAVE_ATOMIC_FLAG_SIMULATION */
-
-#ifdef PG_HAVE_ATOMIC_U32_SIMULATION
-void
-pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_)
-{
-	StaticAssertDecl(sizeof(ptr->sema) >= sizeof(slock_t),
-					 "size mismatch of atomic_uint32 vs slock_t");
-
-	SpinLockInit((slock_t *) &ptr->sema);
-	ptr->value = val_;
-}
-
-void
-pg_atomic_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val)
-{
-	/*
-	 * One might think that an unlocked write doesn't need to acquire the
-	 * spinlock, but one would be wrong. Even an unlocked write has to cause a
-	 * concurrent pg_atomic_compare_exchange_u32() (et al) to fail.
-	 */
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	ptr->value = val;
-	SpinLockRelease((slock_t *) &ptr->sema);
-}
-
-bool
-pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
-									uint32 *expected, uint32 newval)
-{
-	bool		ret;
-
-	/*
-	 * Do atomic op under a spinlock. It might look like we could just skip
-	 * the cmpxchg if the lock isn't available, but that'd just emulate a
-	 * 'weak' compare and swap. I.e. one that allows spurious failures. Since
-	 * several algorithms rely on a strong variant and that is efficiently
-	 * implementable on most major architectures let's emulate it here as
-	 * well.
-	 */
-	SpinLockAcquire((slock_t *) &ptr->sema);
-
-	/* perform compare/exchange logic */
-	ret = ptr->value == *expected;
-	*expected = ptr->value;
-	if (ret)
-		ptr->value = newval;
-
-	/* and release lock */
-	SpinLockRelease((slock_t *) &ptr->sema);
-
-	return ret;
-}
-
-uint32
-pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
-{
-	uint32		oldval;
-
-	SpinLockAcquire((slock_t *) &ptr->sema);
-	oldval = ptr->value;
-	ptr->value += add_;
-	SpinLockRelease((slock_t *) &ptr->sema);
-	return oldval;
-}
-
-#endif							/* PG_HAVE_ATOMIC_U32_SIMULATION */
-
-
 #ifdef PG_HAVE_ATOMIC_U64_SIMULATION
 
 void
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index ed616278b14..bf57690af5b 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -57,9 +57,6 @@
 /* Define to 1 if you have the `ASN1_STRING_get0_data' function. */
 #undef HAVE_ASN1_STRING_GET0_DATA
 
-/* Define to 1 if you want to use atomics if available. */
-#undef HAVE_ATOMICS
-
 /* Define to 1 if you have the <atomic.h> header file. */
 #undef HAVE_ATOMIC_H
 
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index f92a9f62ba4..a6360f021fd 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -17,7 +17,7 @@
  * There exist generic, hardware independent, implementations for several
  * compilers which might be sufficient, although possibly not optimal, for a
  * new platform. If no such generic implementation is available spinlocks will
- * be used to implement the API.
+ * be used to implement the 64-bit parts of the API.
  *
  * Implement _u64 atomics if and only if your platform can use them
  * efficiently (and obviously correctly).
@@ -91,10 +91,7 @@
 #elif defined(__SUNPRO_C) && !defined(__GNUC__)
 #include "port/atomics/generic-sunpro.h"
 #else
-/*
- * Unsupported compiler, we'll likely use slower fallbacks... At least
- * compiler barriers should really be provided.
- */
+#error "unknown compiler, so required atomic type support is missing"
 #endif
 
 /*
diff --git a/src/include/port/atomics/arch-x86.h b/src/include/port/atomics/arch-x86.h
index 3efa79dc3df..30e4c8ea697 100644
--- a/src/include/port/atomics/arch-x86.h
+++ b/src/include/port/atomics/arch-x86.h
@@ -49,8 +49,6 @@
  * nice to support older gcc's and the compare/exchange implementation here is
  * actually more efficient than the * __sync variant.
  */
-#if defined(HAVE_ATOMICS)
-
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
 
 #define PG_HAVE_ATOMIC_FLAG_SUPPORT
@@ -80,8 +78,6 @@ typedef struct pg_atomic_uint64
 
 #endif /* defined(__GNUC__) || defined(__INTEL_COMPILER) */
 
-#endif /* defined(HAVE_ATOMICS) */
-
 #if !defined(PG_HAVE_SPIN_DELAY)
 /*
  * This sequence is equivalent to the PAUSE instruction ("rep" is
@@ -132,8 +128,6 @@ pg_spin_delay_impl(void)
 #endif /* !defined(PG_HAVE_SPIN_DELAY) */
 
 
-#if defined(HAVE_ATOMICS)
-
 #if defined(__GNUC__) || defined(__INTEL_COMPILER)
 
 #define PG_HAVE_ATOMIC_TEST_SET_FLAG
@@ -248,5 +242,3 @@ pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 	defined(__x86_64__) || defined(__x86_64) || defined(_M_X64) /* gcc, sunpro, msvc */
 #define PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY
 #endif /* 8 byte single-copy atomicity */
-
-#endif /* HAVE_ATOMICS */
diff --git a/src/include/port/atomics/fallback.h b/src/include/port/atomics/fallback.h
index 34cfee110fb..d169f675b07 100644
--- a/src/include/port/atomics/fallback.h
+++ b/src/include/port/atomics/fallback.h
@@ -1,7 +1,7 @@
 /*-------------------------------------------------------------------------
  *
  * fallback.h
- *    Fallback for platforms without spinlock and/or atomics support. Slower
+ *    Fallback for platforms without 64 bit atomics support. Slower
  *    than native atomics support, but not unusably slow.
  *
  * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
@@ -51,50 +51,6 @@ extern void pg_extern_compiler_barrier(void);
 #endif
 
 
-/*
- * If we have atomics implementation for this platform, fall back to providing
- * the atomics API using a spinlock to protect the internal state. Possibly
- * the spinlock implementation uses semaphores internally...
- *
- * We have to be a bit careful here, as it's not guaranteed that atomic
- * variables are mapped to the same address in every process (e.g. dynamic
- * shared memory segments). We can't just hash the address and use that to map
- * to a spinlock. Instead assign a spinlock on initialization of the atomic
- * variable.
- */
-#if !defined(PG_HAVE_ATOMIC_FLAG_SUPPORT) && !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
-
-#define PG_HAVE_ATOMIC_FLAG_SIMULATION
-#define PG_HAVE_ATOMIC_FLAG_SUPPORT
-
-typedef struct pg_atomic_flag
-{
-	/*
-	 * To avoid circular includes we can't use s_lock as a type here. Instead
-	 * just reserve enough space for all spinlock types. Some platforms would
-	 * be content with just one byte instead of 4, but that's not too much
-	 * waste.
-	 */
-	int			sema;
-	volatile bool value;
-} pg_atomic_flag;
-
-#endif /* PG_HAVE_ATOMIC_FLAG_SUPPORT */
-
-#if !defined(PG_HAVE_ATOMIC_U32_SUPPORT)
-
-#define PG_HAVE_ATOMIC_U32_SIMULATION
-
-#define PG_HAVE_ATOMIC_U32_SUPPORT
-typedef struct pg_atomic_uint32
-{
-	/* Check pg_atomic_flag's definition above for an explanation */
-	int			sema;
-	volatile uint32 value;
-} pg_atomic_uint32;
-
-#endif /* PG_HAVE_ATOMIC_U32_SUPPORT */
-
 #if !defined(PG_HAVE_ATOMIC_U64_SUPPORT)
 
 #define PG_HAVE_ATOMIC_U64_SIMULATION
@@ -102,49 +58,10 @@ typedef struct pg_atomic_uint32
 #define PG_HAVE_ATOMIC_U64_SUPPORT
 typedef struct pg_atomic_uint64
 {
-	/* Check pg_atomic_flag's definition above for an explanation */
 	int			sema;
 	volatile uint64 value;
 } pg_atomic_uint64;
 
-#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
-
-#ifdef PG_HAVE_ATOMIC_FLAG_SIMULATION
-
-#define PG_HAVE_ATOMIC_INIT_FLAG
-extern void pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr);
-
-#define PG_HAVE_ATOMIC_TEST_SET_FLAG
-extern bool pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr);
-
-#define PG_HAVE_ATOMIC_CLEAR_FLAG
-extern void pg_atomic_clear_flag_impl(volatile pg_atomic_flag *ptr);
-
-#define PG_HAVE_ATOMIC_UNLOCKED_TEST_FLAG
-extern bool pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr);
-
-#endif /* PG_HAVE_ATOMIC_FLAG_SIMULATION */
-
-#ifdef PG_HAVE_ATOMIC_U32_SIMULATION
-
-#define PG_HAVE_ATOMIC_INIT_U32
-extern void pg_atomic_init_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val_);
-
-#define PG_HAVE_ATOMIC_WRITE_U32
-extern void pg_atomic_write_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 val);
-
-#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
-extern bool pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr,
-												uint32 *expected, uint32 newval);
-
-#define PG_HAVE_ATOMIC_FETCH_ADD_U32
-extern uint32 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_);
-
-#endif /* PG_HAVE_ATOMIC_U32_SIMULATION */
-
-
-#ifdef PG_HAVE_ATOMIC_U64_SIMULATION
-
 #define PG_HAVE_ATOMIC_INIT_U64
 extern void pg_atomic_init_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 val_);
 
@@ -155,4 +72,4 @@ extern bool pg_atomic_compare_exchange_u64_impl(volatile pg_atomic_uint64 *ptr,
 #define PG_HAVE_ATOMIC_FETCH_ADD_U64
 extern uint64 pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_);
 
-#endif /* PG_HAVE_ATOMIC_U64_SIMULATION */
+#endif /* PG_HAVE_ATOMIC_U64_SUPPORT */
diff --git a/src/include/port/atomics/generic-gcc.h b/src/include/port/atomics/generic-gcc.h
index 9d91370fa8c..f911cef65ab 100644
--- a/src/include/port/atomics/generic-gcc.h
+++ b/src/include/port/atomics/generic-gcc.h
@@ -53,8 +53,6 @@
 #endif
 
 
-#ifdef HAVE_ATOMICS
-
 /* generic gcc based atomic flag implementation */
 #if !defined(PG_HAVE_ATOMIC_FLAG_SUPPORT) \
 	&& (defined(HAVE_GCC__SYNC_INT32_TAS) || defined(HAVE_GCC__SYNC_CHAR_TAS))
@@ -316,5 +314,3 @@ pg_atomic_fetch_or_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 or_)
 #endif
 
 #endif /* !defined(PG_DISABLE_64_BIT_ATOMICS) */
-
-#endif /* defined(HAVE_ATOMICS) */
diff --git a/src/include/port/atomics/generic-msvc.h b/src/include/port/atomics/generic-msvc.h
index c013aca5e7c..677436f2601 100644
--- a/src/include/port/atomics/generic-msvc.h
+++ b/src/include/port/atomics/generic-msvc.h
@@ -30,8 +30,6 @@
 #define pg_memory_barrier_impl()	MemoryBarrier()
 #endif
 
-#if defined(HAVE_ATOMICS)
-
 #define PG_HAVE_ATOMIC_U32_SUPPORT
 typedef struct pg_atomic_uint32
 {
@@ -115,5 +113,3 @@ pg_atomic_fetch_add_u64_impl(volatile pg_atomic_uint64 *ptr, int64 add_)
 }
 
 #endif /* _WIN64 */
-
-#endif /* HAVE_ATOMICS */
diff --git a/src/include/port/atomics/generic-sunpro.h b/src/include/port/atomics/generic-sunpro.h
index e060c0868a9..aa81ab0df89 100644
--- a/src/include/port/atomics/generic-sunpro.h
+++ b/src/include/port/atomics/generic-sunpro.h
@@ -17,8 +17,6 @@
  * -------------------------------------------------------------------------
  */
 
-#if defined(HAVE_ATOMICS)
-
 #ifdef HAVE_MBARRIER_H
 #include <mbarrier.h>
 
@@ -66,10 +64,6 @@ typedef struct pg_atomic_uint64
 
 #endif /* HAVE_ATOMIC_H */
 
-#endif /* defined(HAVE_ATOMICS) */
-
-
-#if defined(HAVE_ATOMICS)
 
 #ifdef HAVE_ATOMIC_H
 
@@ -116,5 +110,3 @@ pg_atomic_exchange_u64_impl(volatile pg_atomic_uint64 *ptr, uint64 newval)
 }
 
 #endif /* HAVE_ATOMIC_H */
-
-#endif /* defined(HAVE_ATOMICS) */
-- 
2.45.2

From 5260342a16abe91bd7c1b426423a4f2591cbda73 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Tue, 2 Jul 2024 22:31:42 +1200
Subject: [PATCH v1 3/3] Optionally do port/atomics.h with <stdatomic.h>.

Implement port/atomics.h's facilities directly using C11 standard
facilities, if available.  In practice, every modern system has it, but
we don't require C11 yet, so this is done only if a configure time check
finds the header.

XXX Not yet handled: pg_spin_delay().  But why should that be tangled up
with the atomics headers?
XXX Are the barriers or any other operations less efficient than the
hand-crafted stuff?
---
 configure                  |   2 +-
 configure.ac               |   1 +
 meson.build                |   1 +
 src/include/pg_config.h.in |   3 +
 src/include/port/atomics.h | 121 +++++++++++++++++++++++++++++++++++--
 5 files changed, 123 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 4c73e46e246..77a18ce1c72 100755
--- a/configure
+++ b/configure
@@ -13257,7 +13257,7 @@ $as_echo "#define HAVE_STDBOOL_H 1" >>confdefs.h
 fi
 
 
-for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h sys/epoll.h sys/event.h sys/personality.h sys/prctl.h sys/procctl.h sys/signalfd.h sys/ucred.h termios.h ucred.h
+for ac_header in atomic.h copyfile.h execinfo.h getopt.h ifaddrs.h langinfo.h mbarrier.h stdatomic.h sys/epoll.h sys/event.h sys/personality.h sys/prctl.h sys/procctl.h sys/signalfd.h sys/ucred.h termios.h ucred.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
diff --git a/configure.ac b/configure.ac
index 8ee3d2cabb9..06ccb16085c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1440,6 +1440,7 @@ AC_CHECK_HEADERS(m4_normalize([
 	ifaddrs.h
 	langinfo.h
 	mbarrier.h
+	stdatomic.h
 	sys/epoll.h
 	sys/event.h
 	sys/personality.h
diff --git a/meson.build b/meson.build
index a382119528c..ce1062729da 100644
--- a/meson.build
+++ b/meson.build
@@ -2280,6 +2280,7 @@ header_checks = [
   'ifaddrs.h',
   'langinfo.h',
   'mbarrier.h',
+  'stdatomic.h',
   'stdbool.h',
   'strings.h',
   'sys/epoll.h',
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index bf57690af5b..150eefc666d 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -378,6 +378,9 @@
 /* Define to 1 if you have the `SSL_CTX_set_cert_cb' function. */
 #undef HAVE_SSL_CTX_SET_CERT_CB
 
+/* Define to 1 if you have the <stdatomic.h> header file. */
+#undef HAVE_STDATOMIC_H
+
 /* Define to 1 if stdbool.h conforms to C99. */
 #undef HAVE_STDBOOL_H
 
diff --git a/src/include/port/atomics.h b/src/include/port/atomics.h
index a6360f021fd..533c10a2106 100644
--- a/src/include/port/atomics.h
+++ b/src/include/port/atomics.h
@@ -3,9 +3,13 @@
  * atomics.h
  *	  Atomic operations.
  *
- * Hardware and compiler dependent functions for manipulating memory
- * atomically and dealing with cache coherency. Used to implement locking
- * facilities and lockless algorithms/data structures.
+ * If C11 <stdatomic.h> is available, this header just maps pg_XXX names onto
+ * the standard interfaces. Otherwise, for strict C99 environments, hardware-
+ * and compiler-dependent implementation functions are provided.
+ *
+ * These interfaces are for manipulating memory atomically and dealing with
+ * cache coherency. They can be used to implement locking facilities and
+ * lockless algorithms/data structures.
  *
  * To bring up postgres on a platform/compiler at the very least
  * implementations for the following operations should be provided:
@@ -46,6 +50,113 @@
 
 #include <limits.h>
 
+#ifdef HAVE_STDATOMIC_H
+
+/* Map pg_ atomic interfaces directly to standard C11 interfaces. */
+
+#include <stdatomic.h>
+
+/* Prevent compiler re-ordering and control memory ordering. */
+#define pg_memory_barrier_impl() atomic_thread_fence(memory_order_seq_cst)
+#define pg_read_barrier_impl()  atomic_thread_fence(memory_order_acquire)
+#define pg_write_barrier_impl() atomic_thread_fence(memory_order_release)
+
+/* Prevent compiler re-ordering, but don't generate any code. */
+#define pg_compiler_barrier_impl() atomic_signal_fence(memory_order_seq_cst)
+
+/*
+ * We don't map pg_atomic_flag to standard atomic_flag, because that can't
+ * implement pg_atomic_unlocked_test_flag()'s relaxed load.  So we'll just let
+ * generic.h provide an implementation on top of pg_atomic_uint32.
+ */
+
+/*
+ * For pg_atomic_uint32, we require a real lock-free uint32, not one that is
+ * emulated with locks by the compiler or runtime library.
+ */
+#if ATOMIC_INT_LOCK_FREE < 2
+#error atomic_uint is not always lock-free
+#endif
+typedef atomic_uint pg_atomic_uint32;
+#define PG_HAVE_ATOMIC_U32_SUPPORT
+#define pg_atomic_init_u32_impl(x, v) atomic_init((x), (v))
+#define PG_HAVE_ATOMIC_INIT_U32
+#define pg_atomic_read_u32_impl(x) *(x)
+#define PG_HAVE_ATOMIC_READ_U32
+#define pg_atomic_write_u32_impl(x, v) *(x) = (v)
+#define PG_HAVE_ATOMIC_WRITE_U32
+#define pg_atomic_unlocked_write_u32_impl(x, v) *(x) = (v)
+#define PG_HAVE_ATOMIC_UNLOCKED_WRITE_U32
+#define pg_atomic_exchange_u32_impl atomic_exchange
+#define PG_HAVE_ATOMIC_EXCHANGE_U32
+#define pg_atomic_compare_exchange_u32_impl atomic_compare_exchange_strong
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U32
+#define pg_atomic_fetch_add_u32_impl atomic_fetch_add
+#define PG_HAVE_ATOMIC_FETCH_ADD_U32
+#define pg_atomic_fetch_sub_u32_impl atomic_fetch_sub
+#define PG_HAVE_ATOMIC_FETCH_SUB_U32
+#define pg_atomic_fetch_or_u32_impl atomic_fetch_or
+#define PG_HAVE_ATOMIC_FETCH_OR_U32
+#define pg_atomic_fetch_and_u32_impl atomic_fetch_and
+#define PG_HAVE_ATOMIC_FETCH_AND_U32
+#define pg_atomic_fetch_xor_u32_impl atomic_fetch_xor
+#define PG_HAVE_ATOMIC_FETCH_XOR_U32
+
+/*
+ * Does this system also have a 64 bit atomic type that is lock-free?  All
+ * modern systems should, but if not, we'll supply our own lock-based
+ * emulation in fallback.h instead of relying on libc's lock-based emulation.
+ * That reduces the number of possible combinations of behavior on rare
+ * systems.
+ */
+#if defined(DEBUG_NO_ATOMIC_64)
+/* developer-only macro used to force fallback code to be used */
+#elif SIZEOF_LONG == 8 && ATOMIC_LONG_LOCK_FREE > 1
+typedef atomic_ulong pg_atomic_uint64;
+#define PG_HAVE_ATOMIC_U64_SUPPORT
+#elif ATOMIC_LONG_LONG_LOCK_FREE > 1
+typedef atomic_ulonglong pg_atomic_uint64;
+#define PG_HAVE_ATOMIC_U64_SUPPORT
+#endif
+
+#ifdef PG_HAVE_ATOMIC_U64_SUPPORT
+#define pg_atomic_init_u64_impl(x, v) atomic_init((x), (v))
+#define PG_HAVE_ATOMIC_INIT_U64
+#define pg_atomic_read_u64_impl(x) *(x)
+#define PG_HAVE_ATOMIC_READ_U64
+#define pg_atomic_write_u64_impl(x, v) *(x) = (v)
+#define PG_HAVE_ATOMIC_WRITE_U64
+#define pg_atomic_unlocked_write_u64_impl(x, v) *(x) = (v)
+#define PG_HAVE_ATOMIC_UNLOCKED_WRITE_U64
+#define pg_atomic_exchange_u64_impl atomic_exchange
+#define PG_HAVE_ATOMIC_EXCHANGE_U64
+#define pg_atomic_compare_exchange_u64_impl atomic_compare_exchange_strong
+#define PG_HAVE_ATOMIC_COMPARE_EXCHANGE_U64
+#define pg_atomic_fetch_add_u64_impl atomic_fetch_add
+#define PG_HAVE_ATOMIC_FETCH_ADD_U64
+#define pg_atomic_fetch_sub_u64_impl atomic_fetch_sub
+#define PG_HAVE_ATOMIC_FETCH_SUB_U64
+#define pg_atomic_fetch_or_u64_impl atomic_fetch_or
+#define PG_HAVE_ATOMIC_FETCH_OR_U64
+#define pg_atomic_fetch_and_u64_impl atomic_fetch_and
+#define PG_HAVE_ATOMIC_FETCH_AND_U64
+#define pg_atomic_fetch_xor_u64_impl atomic_fetch_xor
+#define PG_HAVE_ATOMIC_FETCH_XOR_U64
+#endif
+
+/*
+ * XXX TODO: we need to get the pg_spin_delay_impl from arch-specific files,
+ * but we don't want anything else from them.  But really, why is that tangled
+ * up with atomics?
+ */
+
+#else
+
+/*
+ * This system doesn't have <stdatomic.h> yet, so we'll use hand-rolled
+ * implementations using compiler- and architecture-specific knowledge.
+ */
+
 /*
  * First a set of architecture specific files is included.
  *
@@ -91,9 +202,11 @@
 #elif defined(__SUNPRO_C) && !defined(__GNUC__)
 #include "port/atomics/generic-sunpro.h"
 #else
-#error "unknown compiler, so required atomic type support is missing"
+#error "no <stdatomic.h> and unknown compiler, so required atomic type support is missing"
 #endif
 
+#endif /* !HAVE_STDATOMIC_H */
+
 /*
  * Provide a full fallback of the pg_*_barrier(), pg_atomic**_flag and
  * pg_atomic_* APIs for platforms without sufficient spinlock and/or atomics
-- 
2.45.2

Reply via email to