CCing Tom's current address, as he's not @redhat.com now.
On Mon, 11 Dec 2023, 19:24 Nate Eldredge, <[email protected]> wrote:
> On Mon, 11 Dec 2023, Nate Eldredge wrote:
>
> > To fix, we need something like `__args._M_old = __val;` inside the loop
> in
> > __atomic_wait_address(), so that we always wait on the exact value that
> the
> > predicate __pred() rejected. Again, there are similar instances in
> > atomic_timed_wait.h.
>
> Thinking through this, there's another problem. The main loop in
> __atomic_wait_address() would then be:
>
> while (!__pred(__val))
> {
> __args._M_old = __val;
> __detail::__wait_impl(__wait_addr, &__args);
> __val = __vfn();
> }
>
> The idea being that we only call __wait_impl to wait on a value that the
> predicate said was unacceptable. But looking for instance at its caller
> __atomic_semaphore::_M_acquire() in bits/semaphore_base.h, the predicate
> passed in is _S_do_try_acquire(), whose code is:
>
> _S_do_try_acquire(__detail::__platform_wait_t* __counter,
> __detail::__platform_wait_t __old) noexcept
> {
> if (__old == 0)
> return false;
>
> return __atomic_impl::compare_exchange_strong(__counter,
> __old, __old - 1,
> memory_order::acquire,
>
> memory_order::relaxed);
> }
>
> It returns false if the value passed in was unacceptable (i.e. zero), *or*
> if it was nonzero (let's say 1) but the compare_exchange failed because
> another thread swooped in and modified the semaphore counter. In that
> latter case, __atomic_wait_address() would pass 1 to __wait_impl(), which
> is likewise bad. If the counter is externally changed back to 1 just
> before we call __platform_wait (that's the futex call), we would go to
> sleep waiting on a semaphore that is already available: deadlock.
>
> I guess there's a couple ways to fix it.
>
> You could have the "predicate" callback instead return a tri-state value:
> "all done, stop waiting" (like current true), "value passed is not
> acceptable" (like current false), and "value was acceptable but something
> else went wrong". Only the second case should result in calling
> __wait_impl(). In the third case, __atomic_wait_address() should
> just reload the value (using __vfn()) and loop again.
>
> Or, make the callback __pred() a pure predicate that only tests its input
> value for acceptability, without any other side effects. Then have
> __atomic_wait_address() simply return as soon as __pred(__val) returns
> true. It would be up to the caller to actually decrement the semaphore or
> whatever, and to call __atomic_wait_address() again if this fails. In
> that case, __atomic_wait_address should probably return the final value
> that was read, so the caller can immediately do something like a
> compare-exchange using it, and not have to do an additional load and
> predicate test.
>
> Or, make __pred() a pure predicate as before, and give
> __atomic_wait_address yet one more callback function argument, call it
> __taker(), whose job is to acquire the semaphore etc, and have
> __atomic_wait_address call it after __pred(__val) returns true.
>
> --
> Nate Eldredge
> [email protected]
>
>