https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928
--- Comment #17 from Jonathan Wakely <redi at gcc dot gnu.org> --- Using _S_do_try_acquire as the predicate for __atomic_wait_address is wrong, because it can fail for two reasons: __old == 0, meaning we can't acquire the semaphore and need to do an atomic wait, or because the compare_exchange fails because another thread changed the counter concurrently. In the latter case we just want to retry the compare_exchange, *not* wait for it to change (because it might never change). We should only wait when the counter is zero. --- a/libstdc++-v3/include/bits/semaphore_base.h +++ b/libstdc++-v3/include/bits/semaphore_base.h @@ -71,7 +71,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } static _GLIBCXX_ALWAYS_INLINE bool - _S_do_try_acquire(__count_type* __counter, __count_type __old) noexcept + _S_do_try_acquire(__count_type* __counter, __count_type& __old) noexcept { if (__old == 0) return false; @@ -82,14 +82,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION memory_order::relaxed); } - _GLIBCXX_ALWAYS_INLINE void + void _M_acquire() noexcept { auto const __vfn = [this]{ return _S_get_current(&this->_M_counter); }; - auto const __pred = [this](__count_type __cur) { - return _S_do_try_acquire(&this->_M_counter, __cur); + auto __val = __vfn(); + auto const __pred = [&__val](__count_type __cur) { + if (__cur > 0) + { + __val = __cur; + return true; + } + return false; }; - std::__atomic_wait_address(&_M_counter, __pred, __vfn, true); + while (!_S_do_try_acquire(&_M_counter, __val)) + { + if (__val == 0) + std::__atomic_wait_address(&_M_counter, __pred, __vfn, true); + } } bool This changes __semaphore_base::_M_acquire to try the compare_exchange in a loop, only waiting when the value is zero. The predicate for the atomic wait is now that the value is positive, so that we don't wait after a compare_exchange that fails due to contention (the predicate and _S_do_try_acquire also update the local variable so that we don't need to load it again). Similar changes are needed to _M_try_acquire, _M_try_acquire_until, and _M_try_acquire_for, but I'll do that on Monday. It can also be simplified to use __atomic_wait_address_v for the case where __count_type is__platform_wait_t, but that can be done separately.