This moves the implementation details of atomic wait/notify functions into the library, so that only a small API surface is exposed to users.
This also fixes some race conditions present in the design for proxied waits:

- The stores to _M_ver in __notify_impl must be protected by the mutex, and
  the loads from _M_ver in __wait_impl and __wait_until_impl to check for
  changes must also be protected by the mutex. This ensures that checking
  _M_ver for updates and waiting on the condition_variable happens
  atomically. Otherwise it's possible to have:
  _M_ver == old happens-before {++_M_ver; cv.notify;} which happens-before
  cv.wait. That scenario results in a missed notification, and so the
  waiting function never wakes. This wasn't a problem for Linux, because the
  futex wait call re-checks the _M_ver value before sleeping, so the
  increment cannot interleave between the check and the wait.

- The initial load from _M_ver that reads the 'old' value used for the
  _M_ver == old checks must be done before loading and checking the value
  of the atomic variable. Otherwise it's possible to have:
  var.load() == val happens-before {++_M_ver; _M_cv.notify_all();}
  happens-before {old = _M_ver; lock mutex; if (_M_ver == old) cv.wait}.
  This results in the waiting thread seeing the already-incremented value
  of _M_ver and then waiting for it to change again, which doesn't happen.
  This race was present even for Linux, because using a futex instead of
  mutex+condvar doesn't prevent the increment from happening before the
  waiting thread checks for the increment.

The first race can be solved locally in the waiting and notifying
functions, by acquiring the mutex lock earlier in the function. The second
race cannot be fixed locally, because the load of the atomic variable and
the check for updates to _M_ver happen in different functions (one in a
function template in the headers and one in the library). We do have an
_M_old data member in the __wait_args_base struct which was previously only
used for non-proxy waits using a futex.
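The corrected protocol for the mutex+condvar fallback can be sketched as a standalone proxy-wait loop (illustrative names only, not the actual libstdc++ internals): the notifier increments the version counter and notifies under the mutex, and the waiter reads the old counter value before re-checking the atomic variable, so neither race above can occur.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Simplified sketch of the corrected proxy-wait protocol.  'ver' stands
// in for __waitable_state::_M_ver.
struct proxy_state
{
  std::mutex mtx;
  std::condition_variable cv;
  unsigned ver = 0;
};

// Wait until var != old_val.
inline void
proxy_wait(proxy_state& st, const std::atomic<int>& var, int old_val)
{
  for (;;)
    {
      // Load the proxy counter FIRST (fixes the second race) ...
      unsigned old_ver;
      {
	std::lock_guard<std::mutex> l(st.mtx);
	old_ver = st.ver;
      }
      // ... then check the atomic variable.  Any notification that
      // happens after this point will also have bumped st.ver.
      if (var.load(std::memory_order_acquire) != old_val)
	return;
      // Re-check the counter and sleep atomically, under the same mutex
      // the notifier uses (fixes the first race).
      std::unique_lock<std::mutex> l(st.mtx);
      st.cv.wait(l, [&] { return st.ver != old_ver; });
    }
}

inline void
proxy_notify(proxy_state& st)
{
  // Increment and notify under the mutex, so a waiter cannot observe
  // the old counter and then go to sleep after the increment.
  std::lock_guard<std::mutex> l(st.mtx);
  ++st.ver;
  st.cv.notify_all();
}
```

With either ordering relaxed, the waiter could sleep forever; with both in place a store+notify from another thread is always observed.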
We can add a new entry point into the library to look up the waitable
state for the address and then load its _M_ver into the _M_old member.
This allows the inline function template to ensure that loading _M_ver
happens-before testing whether the atomic variable has been changed, so
that we can reliably tell if _M_ver changes after we've already tested the
atomic variable. This isn't 100% reliable, because _M_ver could be
incremented 2^32 times and wrap back to the same value, but that seems
unlikely in practice. If/when we support waiting on user-defined
predicates (which could execute long enough for _M_ver to wrap) we might
want to always wait with a timeout, so that we get a chance to re-check
the predicate even in the rare case that _M_ver wraps.

Another change is to make the __wait_until_impl function take a
__wait_clock_t::duration instead of a __wait_clock_t::time_point, so that
it doesn't depend on the symbol name of chrono::steady_clock. Inside the
library it can be converted back to a time_point for the clock. This would
potentially allow using a different clock, if we made a different
__abi_version in the __wait_args imply waiting with a different clock.

This also adds a void* to the __wait_args_base structure, so that
__wait_impl can store the __waitable_state* in there the first time it's
looked up for a given wait, avoiding another lookup on each iteration of
the wait loop. This requires passing the __wait_args_base structure by
non-const reference.

The __waitable_state::_S_track function can be removed now that it's all
internal to the library, and namespace-scope RAII types are added for
locking and tracking contention.

libstdc++-v3/ChangeLog:

	* config/abi/pre/gnu.ver: Add new symbol version and exports.
	* include/bits/atomic_timed_wait.h (__platform_wait_until): Move
	to atomic.cc.
	(__cond_wait_until, __spin_until_impl): Likewise.
	(__wait_until_impl): Likewise. Change __wait_args_base parameter
	to non-const reference and change third parameter to
	__wait_clock_t::duration.
	(__wait_until): Change __wait_args_base parameter to non-const
	reference. Call time_since_epoch() to get duration from
	time_point.
	(__wait_for): Change __wait_args_base parameter to non-const
	reference.
	(__atomic_wait_address_until): Call _M_prep_for_wait_on on args.
	(__atomic_wait_address_for): Likewise.
	(__atomic_wait_address_until_v): Qualify call to avoid ADL. Do
	not forward __vfn.
	* include/bits/atomic_wait.h (__platform_wait_uses_type): Use
	alignof(T) not alignof(T*).
	(__futex_wait_flags, __platform_wait, __platform_notify)
	(__waitable_state, __spin_impl, __notify_impl): Move to
	atomic.cc.
	(__wait_impl): Likewise. Change __wait_args_base parameter to
	non-const reference.
	(__wait_args_base::_M_wait_state): New data member.
	(__wait_args_base::_M_prep_for_wait_on): New member function.
	(__wait_args_base::_M_load_proxy_wait_val): New member function.
	(__wait_args_base::_S_memory_order_for): Remove member function.
	(__atomic_wait_address): Call _M_prep_for_wait_on on args.
	(__atomic_wait_address_v): Qualify call to avoid ADL.
	* src/c++20/Makefile.am: Add new file.
	* src/c++20/Makefile.in: Regenerate.
	* src/c++20/atomic.cc: New file.
	* testsuite/17_intro/headers/c++1998/49745.cc: Remove XFAIL for
	C++20 and later.
	* testsuite/29_atomics/atomic/wait_notify/100334.cc: Remove use
	of internal implementation details.
	* testsuite/util/testsuite_abi.cc: Add GLIBCXX_3.4.35 version.
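The duration-based ABI boundary described above can be sketched as follows (names here are illustrative, not the real __wait_until_impl/__wait_until symbols, and the real code also converts user-supplied clocks via __to_wait_clock first): the header-side template passes only time_since_epoch(), so the exported symbol's mangled name mentions only chrono::duration, and the library side rebuilds the time_point for whatever clock it uses internally.

```cpp
#include <cassert>
#include <chrono>

// The wait clock used internally; an assumption for this sketch.
using wait_clock = std::chrono::steady_clock;

// "Library side": takes a duration, so its mangled name does not
// depend on the clock type.
bool
wait_until_impl(const wait_clock::duration& since_epoch)
{
  // Reconstruct the deadline as a time_point of the internal clock.
  auto deadline = wait_clock::time_point(since_epoch);
  // Returns true while the deadline is still in the future.
  return wait_clock::now() < deadline;
}

// "Header side": converts the caller's time_point to a bare duration
// before crossing the ABI boundary.
template<typename Dur>
bool
wait_until(const std::chrono::time_point<wait_clock, Dur>& atime)
{
  auto at = std::chrono::time_point_cast<wait_clock::duration>(atime);
  return wait_until_impl(at.time_since_epoch());
}
```

Because only the duration crosses the boundary, a future __abi_version could map the same exported symbol onto a different internal clock without changing the header-side signature.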
--- libstdc++-v3/config/abi/pre/gnu.ver | 11 + libstdc++-v3/include/bits/atomic_timed_wait.h | 164 +----- libstdc++-v3/include/bits/atomic_wait.h | 312 ++---------- libstdc++-v3/src/c++20/Makefile.am | 2 +- libstdc++-v3/src/c++20/Makefile.in | 4 +- libstdc++-v3/src/c++20/atomic.cc | 468 ++++++++++++++++++ .../17_intro/headers/c++1998/49745.cc | 2 - .../29_atomics/atomic/wait_notify/100334.cc | 2 + libstdc++-v3/testsuite/util/testsuite_abi.cc | 1 + 9 files changed, 538 insertions(+), 428 deletions(-) create mode 100644 libstdc++-v3/src/c++20/atomic.cc diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver index 29bc7d86256e..c36f1c347675 100644 --- a/libstdc++-v3/config/abi/pre/gnu.ver +++ b/libstdc++-v3/config/abi/pre/gnu.ver @@ -2544,8 +2544,19 @@ GLIBCXX_3.4.34 { # void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<bool>(char const*, size_t) # and wide char version _ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE12_M_constructILb[01]EEEvPK[cw][jmy]; + } GLIBCXX_3.4.33; +# GCC 16.1.0 +GLIBCXX_3.4.35 { + + _ZNSt8__detail11__wait_implEPKvRNS_16__wait_args_baseE; + _ZNSt8__detail13__notify_implEPKvbRKNS_16__wait_args_baseE; + _ZNSt8__detail17__wait_until_implEPKvRNS_16__wait_args_baseERKNSt6chrono8durationI[lx]St5ratioIL[lx]1EL[lx]1000000000EEEE; + _ZNSt8__detail11__wait_args22_M_load_proxy_wait_valEPKv; + +} GLIBCXX_3.4.34; + # Symbols in the support library (libsupc++) have their own tag. 
CXXABI_1.3 { diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h b/libstdc++-v3/include/bits/atomic_timed_wait.h index 19a0225c63b2..3e25607b7d4c 100644 --- a/libstdc++-v3/include/bits/atomic_timed_wait.h +++ b/libstdc++-v3/include/bits/atomic_timed_wait.h @@ -37,7 +37,6 @@ #include <bits/atomic_wait.h> #if __glibcxx_atomic_wait -#include <bits/functional_hash.h> #include <bits/this_thread_sleep.h> #include <bits/chrono.h> @@ -78,154 +77,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #ifdef _GLIBCXX_HAVE_LINUX_FUTEX #define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT - // returns true if wait ended before timeout - bool - __platform_wait_until(const __platform_wait_t* __addr, - __platform_wait_t __old, - const __wait_clock_t::time_point& __atime) noexcept - { - auto __s = chrono::time_point_cast<chrono::seconds>(__atime); - auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s); - - struct timespec __rt = - { - static_cast<std::time_t>(__s.time_since_epoch().count()), - static_cast<long>(__ns.count()) - }; - - auto __e = syscall (SYS_futex, __addr, - static_cast<int>(__futex_wait_flags::__wait_bitset_private), - __old, &__rt, nullptr, - static_cast<int>(__futex_wait_flags::__bitset_match_any)); - if (__e) - { - if (errno == ETIMEDOUT) - return false; - if (errno != EINTR && errno != EAGAIN) - __throw_system_error(errno); - } - return true; - } #else // define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT and implement __platform_wait_until // if there is a more efficient primitive supported by the platform // (e.g. __ulock_wait) which is better than pthread_cond_clockwait. #endif // ! HAVE_LINUX_FUTEX -#ifdef _GLIBCXX_HAS_GTHREADS - // Returns true if wait ended before timeout. 
- inline bool - __cond_wait_until(__condvar& __cv, mutex& __mx, - const __wait_clock_t::time_point& __atime) - { - auto __s = chrono::time_point_cast<chrono::seconds>(__atime); - auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s); - - __gthread_time_t __ts = - { - static_cast<std::time_t>(__s.time_since_epoch().count()), - static_cast<long>(__ns.count()) - }; - -#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT - if constexpr (is_same_v<chrono::steady_clock, __wait_clock_t>) - __cv.wait_until(__mx, CLOCK_MONOTONIC, __ts); - else -#endif - __cv.wait_until(__mx, __ts); - return __wait_clock_t::now() < __atime; - } -#endif // _GLIBCXX_HAS_GTHREADS - - inline __wait_result_type - __spin_until_impl(const __platform_wait_t* __addr, - const __wait_args_base& __args, - const __wait_clock_t::time_point& __deadline) - { - auto __t0 = __wait_clock_t::now(); - using namespace literals::chrono_literals; - - __platform_wait_t __val{}; - auto __now = __wait_clock_t::now(); - for (; __now < __deadline; __now = __wait_clock_t::now()) - { - auto __elapsed = __now - __t0; -#ifndef _GLIBCXX_NO_SLEEP - if (__elapsed > 128ms) - this_thread::sleep_for(64ms); - else if (__elapsed > 64us) - this_thread::sleep_for(__elapsed / 2); - else -#endif - if (__elapsed > 4us) - __thread_yield(); - else if (auto __res = __detail::__spin_impl(__addr, __args); __res.first) - return __res; - - __atomic_load(__addr, &__val, __args._M_order); - if (__val != __args._M_old) - return { true, __val }; - } - return { false, __val }; - } - - inline __wait_result_type - __wait_until_impl(const void* __addr, const __wait_args_base& __a, - const __wait_clock_t::time_point& __atime) - { - __wait_args_base __args = __a; - __waitable_state* __state = nullptr; - const __platform_wait_t* __wait_addr; - if (__args & __wait_flags::__proxy_wait) - { - __state = &__waitable_state::_S_state_for(__addr); - __wait_addr = &__state->_M_ver; - __atomic_load(__wait_addr, &__args._M_old, __args._M_order); - } - else - 
__wait_addr = static_cast<const __platform_wait_t*>(__addr); - - if (__args & __wait_flags::__do_spin) - { - auto __res = __detail::__spin_until_impl(__wait_addr, __args, __atime); - if (__res.first) - return __res; - if (__args & __wait_flags::__spin_only) - return __res; - } - - auto __tracker = __waitable_state::_S_track(__state, __args, __addr); - -#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT - if (__platform_wait_until(__wait_addr, __args._M_old, __atime)) - return { true, __args._M_old }; - else - return { false, __args._M_old }; -#else - __platform_wait_t __val{}; - __atomic_load(__wait_addr, &__val, __args._M_order); - if (__val == __args._M_old) - { - if (!__state) - __state = &__waitable_state::_S_state_for(__addr); - lock_guard<mutex> __l{ __state->_M_mtx }; - __atomic_load(__wait_addr, &__val, __args._M_order); - if (__val == __args._M_old - && __cond_wait_until(__state->_M_cv, __state->_M_mtx, __atime)) - return { true, __val }; - } - return { false, __val }; -#endif - } + __wait_result_type + __wait_until_impl(const void* __addr, __wait_args_base& __args, + const __wait_clock_t::duration& __atime); // Returns {true, val} if wait ended before a timeout. template<typename _Clock, typename _Dur> __wait_result_type - __wait_until(const void* __addr, const __wait_args_base& __args, + __wait_until(const void* __addr, __wait_args_base& __args, const chrono::time_point<_Clock, _Dur>& __atime) noexcept { auto __at = __detail::__to_wait_clock(__atime); - auto __res = __detail::__wait_until_impl(__addr, __args, __at); + auto __res = __detail::__wait_until_impl(__addr, __args, + __at.time_since_epoch()); if constexpr (!is_same_v<__wait_clock_t, _Clock>) if (!__res.first) @@ -242,15 +112,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // Returns {true, val} if wait ended before a timeout. 
template<typename _Rep, typename _Period> __wait_result_type - __wait_for(const void* __addr, const __wait_args_base& __args, + __wait_for(const void* __addr, __wait_args_base& __args, const chrono::duration<_Rep, _Period>& __rtime) noexcept { if (!__rtime.count()) { - __wait_args_base __a = __args; // no rtime supplied, just spin a bit - __a._M_flags |= __wait_flags::__do_spin | __wait_flags::__spin_only; - return __detail::__wait_impl(__addr, __a); + __args._M_flags |= __wait_flags::__do_spin | __wait_flags::__spin_only; + return __detail::__wait_impl(__addr, __args); } auto const __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime); @@ -270,14 +139,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool __bare_wait = false) noexcept { __detail::__wait_args __args{ __addr, __bare_wait }; - _Tp __val = __vfn(); + _Tp __val = __args._M_prep_for_wait_on(__addr, __vfn); while (!__pred(__val)) { auto __res = __detail::__wait_until(__addr, __args, __atime); if (!__res.first) // timed out return __res.first; // C++26 will also return last observed __val - __val = __vfn(); + __val = __args._M_prep_for_wait_on(__addr, __vfn); } return true; // C++26 will also return last observed __val } @@ -298,15 +167,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template<typename _Tp, typename _ValFn, typename _Clock, typename _Dur> bool - __atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn, + __atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, + _ValFn&& __vfn, const chrono::time_point<_Clock, _Dur>& __atime, bool __bare_wait = false) noexcept { auto __pfn = [&](const _Tp& __val) { return !__detail::__atomic_eq(__old, __val); }; - return __atomic_wait_address_until(__addr, __pfn, forward<_ValFn>(__vfn), - __atime, __bare_wait); + return std::__atomic_wait_address_until(__addr, __pfn, __vfn, __atime, + __bare_wait); } template<typename _Tp, @@ -319,14 +189,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool __bare_wait = false) noexcept { __detail::__wait_args __args{ 
__addr, __bare_wait }; - _Tp __val = __vfn(); + _Tp __val = __args._M_prep_for_wait_on(__addr, __vfn); while (!__pred(__val)) { auto __res = __detail::__wait_for(__addr, __args, __rtime); if (!__res.first) // timed out return __res.first; // C++26 will also return last observed __val - __val = __vfn(); + __val = __args._M_prep_for_wait_on(__addr, __vfn); } return true; // C++26 will also return last observed __val } diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h index bdc8677e9ea9..33e8d3202566 100644 --- a/libstdc++-v3/include/bits/atomic_wait.h +++ b/libstdc++-v3/include/bits/atomic_wait.h @@ -37,21 +37,10 @@ #include <bits/version.h> #if __glibcxx_atomic_wait -#include <cstdint> -#include <bits/functional_hash.h> #include <bits/gthr.h> #include <ext/numeric_traits.h> -#ifdef _GLIBCXX_HAVE_LINUX_FUTEX -# include <cerrno> -# include <climits> -# include <unistd.h> -# include <syscall.h> -# include <bits/functexcept.h> -#endif - #include <bits/stl_pair.h> -#include <bits/std_mutex.h> // std::mutex, std::__condvar namespace std _GLIBCXX_VISIBILITY(default) { @@ -82,55 +71,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #ifdef _GLIBCXX_HAVE_PLATFORM_WAIT = is_scalar_v<_Tp> && ((sizeof(_Tp) == sizeof(__detail::__platform_wait_t)) - && (alignof(_Tp*) >= __detail::__platform_wait_alignment)); + && (alignof(_Tp) >= __detail::__platform_wait_alignment)); #else = false; #endif namespace __detail { -#ifdef _GLIBCXX_HAVE_LINUX_FUTEX - enum class __futex_wait_flags : int - { -#ifdef _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE - __private_flag = 128, -#else - __private_flag = 0, -#endif - __wait = 0, - __wake = 1, - __wait_bitset = 9, - __wake_bitset = 10, - __wait_private = __wait | __private_flag, - __wake_private = __wake | __private_flag, - __wait_bitset_private = __wait_bitset | __private_flag, - __wake_bitset_private = __wake_bitset | __private_flag, - __bitset_match_any = -1 - }; - - // If the futex *__addr is equal to __val, wait on the 
futex until woken. - inline void - __platform_wait(const int* __addr, int __val) noexcept - { - auto __e = syscall (SYS_futex, __addr, - static_cast<int>(__futex_wait_flags::__wait_private), - __val, nullptr); - if (!__e || errno == EAGAIN) - return; - if (errno != EINTR) - __throw_system_error(errno); - } - - // Wake threads waiting on the futex *__addr. - inline void - __platform_notify(const int* __addr, bool __all) noexcept - { - syscall (SYS_futex, __addr, - static_cast<int>(__futex_wait_flags::__wake_private), - __all ? INT_MAX : 1); - } -#endif - inline void __thread_yield() noexcept { @@ -149,9 +96,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #endif } - inline constexpr auto __atomic_spin_count_relax = 12; - inline constexpr auto __atomic_spin_count = 16; - // return true if equal template<typename _Tp> inline bool @@ -161,65 +105,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) == 0; } - struct __wait_args_base; - - // The state used by atomic waiting and notifying functions. - struct __waitable_state - { - // Don't use std::hardware_destructive_interference_size here because we - // don't want the layout of library types to depend on compiler options. - static constexpr auto _S_align = 64; - - // Count of threads blocked waiting on this state. - alignas(_S_align) __platform_wait_t _M_waiters = 0; - -#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT - mutex _M_mtx; -#endif - - // If we can't do a platform wait on the atomic variable itself, - // we use this member as a proxy for the atomic variable and we - // use this for waiting and notifying functions instead. 
- alignas(_S_align) __platform_wait_t _M_ver = 0; - -#ifndef _GLIBCXX_HAVE_PLATFORM_WAIT - __condvar _M_cv; -#endif - - __waitable_state() = default; - - void - _M_enter_wait() noexcept - { __atomic_fetch_add(&_M_waiters, 1, __ATOMIC_SEQ_CST); } - - void - _M_leave_wait() noexcept - { __atomic_fetch_sub(&_M_waiters, 1, __ATOMIC_RELEASE); } - - bool - _M_waiting() const noexcept - { - __platform_wait_t __res; - __atomic_load(&_M_waiters, &__res, __ATOMIC_SEQ_CST); - return __res != 0; - } - - static __waitable_state& - _S_state_for(const void* __addr) noexcept - { - constexpr __UINTPTR_TYPE__ __ct = 16; - static __waitable_state __w[__ct]; - auto __key = ((__UINTPTR_TYPE__)__addr >> 2) % __ct; - return __w[__key]; - } - - // Return an RAII type that calls _M_enter_wait() on construction - // and _M_leave_wait() on destruction. - static auto - _S_track(__waitable_state*& __state, const __wait_args_base& __args, - const void* __addr) noexcept; - }; - enum class __wait_flags : __UINT_LEAST32_TYPE__ { __abi_version = 0, @@ -250,6 +135,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __wait_flags _M_flags; int _M_order = __ATOMIC_ACQUIRE; __platform_wait_t _M_old = 0; + void* _M_wait_state = nullptr; // Test whether _M_flags & __flags is non-zero. bool @@ -277,7 +163,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __wait_args(const __wait_args&) noexcept = default; __wait_args& operator=(const __wait_args&) noexcept = default; + template<typename _ValFn, + typename _Tp = decay_t<decltype(std::declval<_ValFn&>()())>> + _Tp + _M_prep_for_wait_on(const void* __addr, _ValFn __vfn) + { + if constexpr (__platform_wait_uses_type<_Tp>) + { + _Tp __val = __vfn(); + // If the wait is not proxied, set the value that we're waiting + // to change. + _M_old = __builtin_bit_cast(__platform_wait_t, __val); + return __val; + } + else + { + // Otherwise, it's a proxy wait and the proxy's _M_ver is used. + // This load must happen before the one done by __vfn(). 
+ _M_load_proxy_wait_val(__addr); + return __vfn(); + } + } + private: + // Populates _M_wait_state and _M_old from the proxy for __addr. + void + _M_load_proxy_wait_val(const void* __addr); + template<typename _Tp> static constexpr __wait_flags _S_flags_for(const _Tp*, bool __bare_wait) noexcept @@ -290,161 +202,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __res |= __proxy_wait; return __res; } - - // XXX what is this for? It's never used. - template<typename _Tp> - static int - _S_memory_order_for(const _Tp*, int __order) noexcept - { - if constexpr (__platform_wait_uses_type<_Tp>) - return __order; - return __ATOMIC_ACQUIRE; - } }; - inline auto - __waitable_state::_S_track(__waitable_state*& __state, - const __wait_args_base& __args, - const void* __addr) noexcept - { - struct _Tracker - { - _Tracker() noexcept : _M_st(nullptr) { } - - [[__gnu__::__nonnull__]] - explicit - _Tracker(__waitable_state* __st) noexcept - : _M_st(__st) - { __st->_M_enter_wait(); } - - _Tracker(const _Tracker&) = delete; - _Tracker& operator=(const _Tracker&) = delete; - - ~_Tracker() { if (_M_st) _M_st->_M_leave_wait(); } - - __waitable_state* _M_st; - }; - - if (__args & __wait_flags::__track_contention) - { - // Caller does not externally track contention, - // so we want to increment+decrement __state->_M_waiters - - // First make sure we have a waitable state for the address. - if (!__state) - __state = &__waitable_state::_S_state_for(__addr); - - // This object will increment the number of waiters and - // decrement it again on destruction. - return _Tracker{__state}; - } - return _Tracker{}; // For bare waits caller tracks waiters. 
- } - using __wait_result_type = pair<bool, __platform_wait_t>; - inline __wait_result_type - __spin_impl(const __platform_wait_t* __addr, const __wait_args_base& __args) - { - __platform_wait_t __val; - for (auto __i = 0; __i < __atomic_spin_count; ++__i) - { - __atomic_load(__addr, &__val, __args._M_order); - if (__val != __args._M_old) - return { true, __val }; - if (__i < __atomic_spin_count_relax) - __detail::__thread_relax(); - else - __detail::__thread_yield(); - } - return { false, __val }; - } + __wait_result_type + __wait_impl(const void* __addr, __wait_args_base&); - inline __wait_result_type - __wait_impl(const void* __addr, const __wait_args_base& __a) - { - __wait_args_base __args = __a; - __waitable_state* __state = nullptr; - - const __platform_wait_t* __wait_addr; - if (__args & __wait_flags::__proxy_wait) - { - __state = &__waitable_state::_S_state_for(__addr); - __wait_addr = &__state->_M_ver; - __atomic_load(__wait_addr, &__args._M_old, __args._M_order); - } - else - __wait_addr = static_cast<const __platform_wait_t*>(__addr); - - if (__args & __wait_flags::__do_spin) - { - auto __res = __detail::__spin_impl(__wait_addr, __args); - if (__res.first) - return __res; - if (__args & __wait_flags::__spin_only) - return __res; - } - - auto __tracker = __waitable_state::_S_track(__state, __args, __addr); - -#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT - __platform_wait(__wait_addr, __args._M_old); - return { false, __args._M_old }; -#else - __platform_wait_t __val; - __atomic_load(__wait_addr, &__val, __args._M_order); - if (__val == __args._M_old) - { - if (!__state) - __state = &__waitable_state::_S_state_for(__addr); - lock_guard<mutex> __l{ __state->_M_mtx }; - __atomic_load(__wait_addr, &__val, __args._M_order); - if (__val == __args._M_old) - __state->_M_cv.wait(__state->_M_mtx); - } - return { false, __val }; -#endif - } - - inline void - __notify_impl(const void* __addr, [[maybe_unused]] bool __all, - const __wait_args_base& __args) - { - 
__waitable_state* __state = nullptr; - - const __platform_wait_t* __wait_addr; - if (__args & __wait_flags::__proxy_wait) - { - __state = &__waitable_state::_S_state_for(__addr); - // Waiting for *__addr is actually done on the proxy's _M_ver. - __wait_addr = &__state->_M_ver; - __atomic_fetch_add(&__state->_M_ver, 1, __ATOMIC_RELAXED); - // Because the proxy might be shared by several waiters waiting - // on different atomic variables, we need to wake them all so - // they can re-evaluate their conditions to see if they should - // stop waiting or should wait again. - __all = true; - } - else // Use the atomic variable's own address. - __wait_addr = static_cast<const __platform_wait_t*>(__addr); - - if (__args & __wait_flags::__track_contention) - { - if (!__state) - __state = &__waitable_state::_S_state_for(__addr); - if (!__state->_M_waiting()) - return; - } - -#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT - __platform_notify(__wait_addr, __all); -#else - if (!__state) - __state = &__waitable_state::_S_state_for(__addr); - lock_guard<mutex> __l{ __state->_M_mtx }; - __state->_M_cv.notify_all(); -#endif - } + void + __notify_impl(const void* __addr, bool __all, const __wait_args_base&); } // namespace __detail // Wait on __addr while __pred(__vfn()) is false. @@ -456,18 +222,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION bool __bare_wait = false) noexcept { __detail::__wait_args __args{ __addr, __bare_wait }; - _Tp __val = __vfn(); + _Tp __val = __args._M_prep_for_wait_on(__addr, __vfn); while (!__pred(__val)) { - // If the wait is not proxied, set the value that we're waiting - // to change. - if constexpr (__platform_wait_uses_type<_Tp>) - __args._M_old = __builtin_bit_cast(__detail::__platform_wait_t, - __val); - // Otherwise, it's a proxy wait and the proxy's _M_ver is used. 
- __detail::__wait_impl(__addr, __args); - __val = __vfn(); + __val = __args._M_prep_for_wait_on(__addr, __vfn); } // C++26 will return __val } @@ -490,7 +249,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { auto __pfn = [&](const _Tp& __val) { return !__detail::__atomic_eq(__old, __val); }; - __atomic_wait_address(__addr, __pfn, forward<_ValFn>(__vfn)); + std::__atomic_wait_address(__addr, __pfn, forward<_ValFn>(__vfn)); } template<typename _Tp> @@ -501,6 +260,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __detail::__wait_args __args{ __addr, __bare_wait }; __detail::__notify_impl(__addr, __all, __args); } + _GLIBCXX_END_NAMESPACE_VERSION } // namespace std #endif // __glibcxx_atomic_wait diff --git a/libstdc++-v3/src/c++20/Makefile.am b/libstdc++-v3/src/c++20/Makefile.am index 4f7a6d12a6cf..15e6f3445fb6 100644 --- a/libstdc++-v3/src/c++20/Makefile.am +++ b/libstdc++-v3/src/c++20/Makefile.am @@ -36,7 +36,7 @@ else inst_sources = endif -sources = tzdb.cc format.cc +sources = tzdb.cc format.cc atomic.cc vpath % $(top_srcdir)/src/c++20 diff --git a/libstdc++-v3/src/c++20/Makefile.in b/libstdc++-v3/src/c++20/Makefile.in index d759b8dcc7cd..d9e1615bbca8 100644 --- a/libstdc++-v3/src/c++20/Makefile.in +++ b/libstdc++-v3/src/c++20/Makefile.in @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES = CONFIG_CLEAN_VPATH_FILES = LTLIBRARIES = $(noinst_LTLIBRARIES) libc__20convenience_la_LIBADD = -am__objects_1 = tzdb.lo format.lo +am__objects_1 = tzdb.lo format.lo atomic.lo @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS = \ @GLIBCXX_HOSTED_TRUE@ $(am__objects_1) $(am__objects_2) @@ -432,7 +432,7 @@ headers = @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \ @ENABLE_EXTERN_TEMPLATE_TRUE@ sstream-inst.cc -sources = tzdb.cc format.cc +sources = tzdb.cc format.cc atomic.cc @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES = @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources) $(inst_sources) diff --git 
a/libstdc++-v3/src/c++20/atomic.cc b/libstdc++-v3/src/c++20/atomic.cc new file mode 100644 index 000000000000..b9ad66b1ec30 --- /dev/null +++ b/libstdc++-v3/src/c++20/atomic.cc @@ -0,0 +1,468 @@ +// Definitions for <atomic> wait/notify -*- C++ -*- + +// Copyright (C) 2020-2025 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// Under Section 7 of GPL version 3, you are granted additional +// permissions described in the GCC Runtime Library Exception, version +// 3.1, as published by the Free Software Foundation. + +// You should have received a copy of the GNU General Public License and +// a copy of the GCC Runtime Library Exception along with this program; +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +// <http://www.gnu.org/licenses/>. + +#include <bits/version.h> + +#if __glibcxx_atomic_wait +#include <atomic> +#include <bits/atomic_timed_wait.h> +#include <bits/functional_hash.h> +#include <cstdint> +#include <bits/std_mutex.h> // std::mutex, std::__condvar + +#ifdef _GLIBCXX_HAVE_LINUX_FUTEX +# include <cerrno> +# include <climits> +# include <unistd.h> +# include <syscall.h> +# include <bits/functexcept.h> +# include <sys/time.h> +#endif + +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT +# ifndef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT +// __waitable_state assumes that we consistently use the same implementation +// (i.e. futex vs mutex+condvar) for timed and untimed waiting. 
+# error "This configuration is not currently supported" +# endif +#endif + +namespace std +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION +namespace __detail +{ +namespace +{ +#ifdef _GLIBCXX_HAVE_LINUX_FUTEX + enum class __futex_wait_flags : int + { +#ifdef _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE + __private_flag = 128, +#else + __private_flag = 0, +#endif + __wait = 0, + __wake = 1, + __wait_bitset = 9, + __wake_bitset = 10, + __wait_private = __wait | __private_flag, + __wake_private = __wake | __private_flag, + __wait_bitset_private = __wait_bitset | __private_flag, + __wake_bitset_private = __wake_bitset | __private_flag, + __bitset_match_any = -1 + }; + + void + __platform_wait(const int* __addr, int __val) noexcept + { + auto __e = syscall (SYS_futex, __addr, + static_cast<int>(__futex_wait_flags::__wait_private), + __val, nullptr); + if (!__e || errno == EAGAIN) + return; + if (errno != EINTR) + __throw_system_error(errno); + } + + void + __platform_notify(const int* __addr, bool __all) noexcept + { + syscall (SYS_futex, __addr, + static_cast<int>(__futex_wait_flags::__wake_private), + __all ? INT_MAX : 1); + } +#endif + + // The state used by atomic waiting and notifying functions. + struct __waitable_state + { + // Don't use std::hardware_destructive_interference_size here because we + // don't want the layout of library types to depend on compiler options. + static constexpr auto _S_align = 64; + + // Count of threads blocked waiting on this state. + alignas(_S_align) __platform_wait_t _M_waiters = 0; + +#ifndef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT + mutex _M_mtx; + + // This type meets the Cpp17BasicLockable requirements. + void lock() { _M_mtx.lock(); } + void unlock() { _M_mtx.unlock(); } +#else + void lock() { } + void unlock() { } +#endif + + // If we can't do a platform wait on the atomic variable itself, + // we use this member as a proxy for the atomic variable and we + // use this for waiting and notifying functions instead. 
+ alignas(_S_align) __platform_wait_t _M_ver = 0; + +#ifndef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT + __condvar _M_cv; +#endif + + __waitable_state() = default; + + void + _M_enter_wait() noexcept + { __atomic_fetch_add(&_M_waiters, 1, __ATOMIC_SEQ_CST); } + + void + _M_leave_wait() noexcept + { __atomic_fetch_sub(&_M_waiters, 1, __ATOMIC_RELEASE); } + + bool + _M_waiting() const noexcept + { + __platform_wait_t __res; + __atomic_load(&_M_waiters, &__res, __ATOMIC_SEQ_CST); + return __res != 0; + } + + static __waitable_state& + _S_state_for(const void* __addr) noexcept + { + constexpr __UINTPTR_TYPE__ __ct = 16; + static __waitable_state __w[__ct]; + auto __key = ((__UINTPTR_TYPE__)__addr >> 2) % __ct; + return __w[__key]; + } + }; + + // Scope-based contention tracking. + struct scoped_wait + { + // pre: if track_contention is in flags, then args._M_wait_state != nullptr + explicit + scoped_wait(const __wait_args_base& args) : _M_state(nullptr) + { + if (args & __wait_flags::__track_contention) + { + _M_state = static_cast<__waitable_state*>(args._M_wait_state); + _M_state->_M_enter_wait(); + } + } + + ~scoped_wait() + { + if (_M_state) + _M_state->_M_leave_wait(); + } + + scoped_wait(scoped_wait&&) = delete; + + __waitable_state* _M_state; + }; + + // Scoped lock type + struct waiter_lock + { + // pre: args._M_state != nullptr + explicit + waiter_lock(const __wait_args_base& args) + : _M_state(*static_cast<__waitable_state*>(args._M_wait_state)), + _M_track_contention(args & __wait_flags::__track_contention) + { + _M_state.lock(); + if (_M_track_contention) + _M_state._M_enter_wait(); + } + + waiter_lock(waiter_lock&&) = delete; + + ~waiter_lock() + { + if (_M_track_contention) + _M_state._M_leave_wait(); + _M_state.unlock(); + } + + __waitable_state& _M_state; + bool _M_track_contention; + }; + + constexpr auto __atomic_spin_count_relax = 12; + constexpr auto __atomic_spin_count = 16; + + __wait_result_type + __spin_impl(const __platform_wait_t* __addr, const 
__wait_args_base& __args) + { + __platform_wait_t __val; + for (auto __i = 0; __i < __atomic_spin_count; ++__i) + { + __atomic_load(__addr, &__val, __args._M_order); + if (__val != __args._M_old) + return { true, __val }; + if (__i < __atomic_spin_count_relax) + __thread_relax(); + else + __thread_yield(); + } + return { false, __val }; + } + + inline __waitable_state* + set_wait_state(const void* addr, __wait_args_base& args) + { + if (args._M_wait_state == nullptr) + args._M_wait_state = &__waitable_state::_S_state_for(addr); + return static_cast<__waitable_state*>(args._M_wait_state); + } + +} // namespace + +// Called for a proxy wait +void +__wait_args::_M_load_proxy_wait_val(const void* addr) +{ + // __glibcxx_assert( *this & __wait_flags::__proxy_wait ); + + // We always need a waitable state for proxy waits. + auto state = set_wait_state(addr, *this); + + // Read the value of the _M_ver counter. + __atomic_load(&state->_M_ver, &_M_old, __ATOMIC_ACQUIRE); +} + +__wait_result_type +__wait_impl(const void* __addr, __wait_args_base& __args) +{ + auto __state = static_cast<__waitable_state*>(__args._M_wait_state); + + const __platform_wait_t* __wait_addr; + + if (__args & __wait_flags::__proxy_wait) + __wait_addr = &__state->_M_ver; + else + __wait_addr = static_cast<const __platform_wait_t*>(__addr); + + if (__args & __wait_flags::__do_spin) + { + auto __res = __detail::__spin_impl(__wait_addr, __args); + if (__res.first) + return __res; + if (__args & __wait_flags::__spin_only) + return __res; + } + +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT + if (__args & __wait_flags::__track_contention) + set_wait_state(__addr, __args); + scoped_wait s(__args); + __platform_wait(__wait_addr, __args._M_old); + return { false, __args._M_old }; +#else + waiter_lock l(__args); + __platform_wait_t __val; + __atomic_load(__wait_addr, &__val, __args._M_order); + if (__val == __args._M_old) + __state->_M_cv.wait(__state->_M_mtx); + return { false, __val }; +#endif +} + +void 
+__notify_impl(const void* __addr, [[maybe_unused]] bool __all, + const __wait_args_base& __args) +{ + auto __state = static_cast<__waitable_state*>(__args._M_wait_state); + if (!__state) + __state = &__waitable_state::_S_state_for(__addr); + + [[maybe_unused]] const __platform_wait_t* __wait_addr; + + // Lock mutex so that proxied waiters cannot race with incrementing _M_ver + // and see the old value, then sleep after the increment and notify_all(). + lock_guard __l{ *__state }; + + if (__args & __wait_flags::__proxy_wait) + { + // Waiting for *__addr is actually done on the proxy's _M_ver. + __wait_addr = &__state->_M_ver; + + // Increment _M_ver so that waiting threads see something changed. + // This has to be atomic because the load in _M_load_proxy_wait_val + // is done without the mutex locked. + __atomic_fetch_add(&__state->_M_ver, 1, __ATOMIC_RELEASE); + + // Because the proxy might be shared by several waiters waiting + // on different atomic variables, we need to wake them all so + // they can re-evaluate their conditions to see if they should + // stop waiting or should wait again. + __all = true; + } + else // Use the atomic variable's own address. 
+ __wait_addr = static_cast<const __platform_wait_t*>(__addr); + + if (__args & __wait_flags::__track_contention) + { + if (!__state->_M_waiting()) + return; + } + +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT + __platform_notify(__wait_addr, __all); +#else + __state->_M_cv.notify_all(); +#endif +} + +// Timed atomic waiting functions + +namespace +{ +#ifdef _GLIBCXX_HAVE_LINUX_FUTEX +// returns true if wait ended before timeout +bool +__platform_wait_until(const __platform_wait_t* __addr, + __platform_wait_t __old, + const __wait_clock_t::time_point& __atime) noexcept +{ + auto __s = chrono::time_point_cast<chrono::seconds>(__atime); + auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s); + + struct timespec __rt = + { + static_cast<std::time_t>(__s.time_since_epoch().count()), + static_cast<long>(__ns.count()) + }; + + if (syscall (SYS_futex, __addr, + static_cast<int>(__futex_wait_flags::__wait_bitset_private), + __old, &__rt, nullptr, + static_cast<int>(__futex_wait_flags::__bitset_match_any))) + { + if (errno == ETIMEDOUT) + return false; + if (errno != EINTR && errno != EAGAIN) + __throw_system_error(errno); + } + return true; +} +#endif // HAVE_LINUX_FUTEX + +#ifndef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT +bool +__cond_wait_until(__condvar& __cv, mutex& __mx, + const __wait_clock_t::time_point& __atime) +{ + auto __s = chrono::time_point_cast<chrono::seconds>(__atime); + auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s); + + __gthread_time_t __ts = + { + static_cast<std::time_t>(__s.time_since_epoch().count()), + static_cast<long>(__ns.count()) + }; + +#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT + if constexpr (is_same_v<chrono::steady_clock, __wait_clock_t>) + __cv.wait_until(__mx, CLOCK_MONOTONIC, __ts); + else +#endif + __cv.wait_until(__mx, __ts); + return __wait_clock_t::now() < __atime; +} +#endif // ! 
HAVE_PLATFORM_TIMED_WAIT + +__wait_result_type +__spin_until_impl(const __platform_wait_t* __addr, + const __wait_args_base& __args, + const __wait_clock_t::time_point& __deadline) +{ + auto __t0 = __wait_clock_t::now(); + using namespace literals::chrono_literals; + + __platform_wait_t __val{}; + auto __now = __wait_clock_t::now(); + for (; __now < __deadline; __now = __wait_clock_t::now()) + { + auto __elapsed = __now - __t0; +#ifndef _GLIBCXX_NO_SLEEP + if (__elapsed > 128ms) + this_thread::sleep_for(64ms); + else if (__elapsed > 64us) + this_thread::sleep_for(__elapsed / 2); + else +#endif + if (__elapsed > 4us) + __thread_yield(); + else if (auto __res = __detail::__spin_impl(__addr, __args); __res.first) + return __res; + + __atomic_load(__addr, &__val, __args._M_order); + if (__val != __args._M_old) + return { true, __val }; + } + return { false, __val }; +} +} // namespace + +__wait_result_type +__wait_until_impl(const void* __addr, __wait_args_base& __args, + const __wait_clock_t::duration& __time) +{ + const __wait_clock_t::time_point __atime(__time); + auto __state = static_cast<__waitable_state*>(__args._M_wait_state); + const __platform_wait_t* __wait_addr; + if (__args & __wait_flags::__proxy_wait) + __wait_addr = &__state->_M_ver; + else + __wait_addr = static_cast<const __platform_wait_t*>(__addr); + + if (__args & __wait_flags::__do_spin) + { + auto __res = __detail::__spin_until_impl(__wait_addr, __args, __atime); + if (__res.first) + return __res; + if (__args & __wait_flags::__spin_only) + return __res; + } + +#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT + if (__args & __wait_flags::__track_contention) + set_wait_state(__addr, __args); + scoped_wait s(__args); + if (__platform_wait_until(__wait_addr, __args._M_old, __atime)) + return { true, __args._M_old }; + else + return { false, __args._M_old }; +#else + waiter_lock l(__args); + __platform_wait_t __val; + __atomic_load(__wait_addr, &__val, __args._M_order); + if (__val == __args._M_old + && 
__cond_wait_until(__state->_M_cv, __state->_M_mtx, __atime)) + return { true, __val }; + return { false, __val }; +#endif +} + +} // namespace __detail +_GLIBCXX_END_NAMESPACE_VERSION +} // namespace std +#endif diff --git a/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc b/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc index 7fafe7b64b0c..3b9d2ebd910e 100644 --- a/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc +++ b/libstdc++-v3/testsuite/17_intro/headers/c++1998/49745.cc @@ -131,5 +131,3 @@ #endif int truncate = 0; - -// { dg-xfail-if "PR libstdc++/99995" { c++20 } } diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc index 58a0da6e6def..21ff570ce20b 100644 --- a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc +++ b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc @@ -53,9 +53,11 @@ main() atom->store(0); } +#if 0 auto a = &std::__detail::__waitable_state::_S_state_for((void*)(atomics.a[0])); auto b = &std::__detail::__waitable_state::_S_state_for((void*)(atomics.a[1])); VERIFY( a == b ); +#endif auto fut0 = std::async(std::launch::async, [&] { atomics.a[0]->wait(0); }); auto fut1 = std::async(std::launch::async, [&] { atomics.a[1]->wait(0); }); diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc b/libstdc++-v3/testsuite/util/testsuite_abi.cc index 90cda2fbca83..7bffc6b74f75 100644 --- a/libstdc++-v3/testsuite/util/testsuite_abi.cc +++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc @@ -216,6 +216,7 @@ check_version(symbol& test, bool added) known_versions.push_back("GLIBCXX_3.4.32"); known_versions.push_back("GLIBCXX_3.4.33"); known_versions.push_back("GLIBCXX_3.4.34"); + known_versions.push_back("GLIBCXX_3.4.35"); known_versions.push_back("GLIBCXX_LDBL_3.4.31"); known_versions.push_back("GLIBCXX_IEEE128_3.4.29"); known_versions.push_back("GLIBCXX_IEEE128_3.4.30"); -- 2.49.0