On Tue, 3 Jun 2025 at 09:46, Jonathan Wakely wrote:
>
> This adds a new implementation of std::counting_semaphore<Max> for the
> case where Max == 1, i.e. the std::binary_semaphore typedef. When the
> maximum counter value is 1 we don't need to load the current counter
> value before doing a compare-exchange to acquire the semaphore. We can
> just optimisitcally assume it's currently 1, and if that's true then the
> compare_exchange will succeed. This simplifies _M_try_acquire so that we
> don't need the separate _M_do_try_acquire function used by the general
> __semaphore_base implementation for _Max > 1 cases.
>
> We can also use the simpler forms of atomic waiting that just take a
> value instead of a value accessor and predicate, because we know that
> the _M_counter is always a __platform_wait_t. This change adds a
> bare_wait flag to __atomic_wait_address_v because we don't need to track
> waiters for semaphores, we only need to notify when a semaphore with a
> count of zero is released.
>
> I'm not sure if this makes the code any faster in real scenarios, but
> the generated code for std::binary_semaphore is slightly smaller now.
>
> libstdc++-v3/ChangeLog:
>
>         * include/bits/semaphore_base.h (__binary_semaphore_impl): New
>         class with optimized implementation for std::binary_semaphore.
>         (__semaphore_impl) <_max == 1>: Modify alias template to use
>         __binary_semaphore_impl.
>         * include/bits/atomic_wait.h (__atomic_wait_address_v): Add
>         parameter for bare waits.
> ---
>
> Tested x86_64-linux.
>
>  libstdc++-v3/include/bits/atomic_wait.h    |  5 +-
>  libstdc++-v3/include/bits/semaphore_base.h | 73 +++++++++++++++++++++-
>  2 files changed, 73 insertions(+), 5 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
> b/libstdc++-v3/include/bits/atomic_wait.h
> index 815726c16ccb..9ae11191d9ab 100644
> --- a/libstdc++-v3/include/bits/atomic_wait.h
> +++ b/libstdc++-v3/include/bits/atomic_wait.h
> @@ -249,12 +249,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        // C++26 will return __val
>      }
>
> +  // Wait on __addr while *__addr == __old is true.
>    inline void
>    __atomic_wait_address_v(const __detail::__platform_wait_t* __addr,
>                           __detail::__platform_wait_t __old,
> -                         int __order)
> +                         int __order, bool __bare_wait = false)
>    {
> -    __detail::__wait_args __args{ __addr, __old, __order };
> +    __detail::__wait_args __args{ __addr, __old, __order, __bare_wait };
>      // C++26 will not ignore the return value here
>      __detail::__wait_impl(__addr, __args);

This crashes here on Solaris, this function doesn't work for proxy
waits (i.e. targets without futex). It needs to use
__args._M_setup_wait to get the proxy wait state, or needs the
additional fix pasted below. Patches 3/4 and 4/4 depend on that fix.

Patches 1/4 (which fixes the deadlock bug) and 2/4 (which is not
essential but IMHO makes the code simpler) are still OK and pass all
tests on Solaris without this (because only 3/4 introduces uses of the
__atomic_wait_address_v overload that doesn't support proxy waits).


--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -249,14 +249,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
      // C++26 will return __val
    }

+  // Wait on __addr while *__addr == __old is true.
  inline void
  __atomic_wait_address_v(const __detail::__platform_wait_t* __addr,
                         __detail::__platform_wait_t __old,
-                         int __order)
+                         int __order, bool __bare_wait = false)
  {
-    __detail::__wait_args __args{ __addr, __old, __order };
+    // If we have a platform wait function (e.g. futex wait) that atomically
+    // loads *__addr and compares it to __args._M_old then we can use that.
+    // Otherwise we need to use an accessor function to do that load and a
+    // predicate function to do the comparison.
+#if _GLIBCXX_HAVE_PLATFORM_WAIT
+    __detail::__wait_args __args{ __addr, __old, __order, __bare_wait };
    // C++26 will not ignore the return value here
    __detail::__wait_impl(__addr, __args);
+#else
+    auto __vfn = [&__addr] { return __atomic_load_n(__addr, __order); };
+    auto __pfn = [&__old](const __detail::__platform_wait_t& __val) {
+      return !__detail::__atomic_eq(__old, __val);
+    };
+    return std::__atomic_wait_address(__addr, __pfn, __vfn, __bare_wait);
+#endif
  }

  // Wait on __addr while __vfn() == __old is true.

Reply via email to