On 2025/05/27 1:51, Paolo Bonzini wrote:
On 5/26/25 07:29, Akihiko Odaki wrote:
Changes in v4:
- Added patch "qemu-thread: Remove qatomic_read() in qemu_event_set()".

Hi Akihiko,

I'm not so confident about putting this patch before the other changes;
I'm referring basically to this hunk:

diff --git a/util/event.c b/util/event.c
index 366c77c90cf..663b7042b17 100644
--- a/util/event.c
+++ b/util/event.c
@@ -48,22 +48,9 @@ void qemu_event_set(QemuEvent *ev)
      assert(ev->initialized);

  #ifdef HAVE_FUTEX
-    /*
-     * Pairs with both qemu_event_reset() and qemu_event_wait().
-     *
-     * qemu_event_set has release semantics, but because it *loads*
-     * ev->value we need a full memory barrier here.
-     */
-    smp_mb();
-    if (qatomic_read(&ev->value) != EV_SET) {
-        int old = qatomic_xchg(&ev->value, EV_SET);
-
-        /* Pairs with memory barrier in kernel futex_wait system call.  */
-        smp_mb__after_rmw();
-        if (old == EV_BUSY) {
-            /* There were waiters, wake them up.  */
-            qemu_futex_wake_all(ev);
-        }
+    if (qatomic_xchg(&ev->value, EV_SET) == EV_BUSY) {
+        /* There were waiters, wake them up.  */
+        qemu_futex_wake_all(ev);
      }
  #else
      pthread_mutex_lock(&ev->lock);


... feel free to resubmit that separately, also because it's missing
a smp_mb__before_rmw().

I'd like to submit it with "[PATCH v4 05/11] qemu-thread: Avoid futex abstraction for non-Linux" because it aligns the implementations of Linux and non-Linux versions to rely on a store-release of EV_SET in qemu_event_set().

smp_mb__before_rmw() is removed with "[PATCH v4 01/11] futex: Check value after qemu_futex_wait()" because the fact mentioned with the comment "Pairs with memory barrier in kernel futex_wait system call" is no longer relevant; the patch makes QEMU always rely on qatomic_load_acquire() or qatomic_cmpxchg() to perform a load-acquire of EV_SET for ordering.

In either case, I can simply that say the ordering between qemu_event_set() and qemu_event_wait() are ensured by load-acquire and store-release operations of EV_SET, but to make the correctness absolutely sure, I recommend to look at my past explanation:
https://lore.kernel.org/qemu-devel/ab6b66d7-fa8c-4049-9a3b-975f7f9c0...@daynix.com

The explanation is long but so comprehensive that it lists all memory accesses, ordering requirements, and features of atomic primitives and futex employed to satisfy the requirements.



Also, I'm not sure what was your opinion of the more optimized version
of qemu_event_reset:

diff --git a/util/event.c b/util/event.c
index 366c77c90cf..663b7042b17 100644
--- a/util/event.c
+++ b/util/event.c
@@ -78,6 +78,7 @@ void qemu_event_reset(QemuEvent *ev)
  {
      assert(ev->initialized);

+#ifdef HAVE_FUTEX
      /*
       * If there was a concurrent reset (or even reset+wait),
       * do nothing.  Otherwise change EV_SET->EV_FREE.
@@ -86,6 +87,28 @@ void qemu_event_reset(QemuEvent *ev)
       */
      smp_mb__after_rmw();
+#else
+    /*
+     * If futexes are not available, there are no EV_FREE->EV_BUSY
+     * transitions because wakeups are done entirely through the
+     * condition variable.  Since qatomic_set() only writes EV_FREE,
+     * the load seems useless but in reality, the acquire synchronizes
+     * with qemu_event_set()'s store release: if qemu_event_reset()
+     * sees EV_SET here, then the caller will certainly see a
+     * successful condition and skip qemu_event_wait():
+     *
+     * done = 1;                 if (done == 0)
+     * qemu_event_set() {          qemu_event_reset() {
+     *   lock();
+     *   ev->value = EV_SET ----->     load ev->value
+     *                                 ev->value = old value | EV_FREE
+     *   cond_broadcast()
+     *   unlock();                 }
+     * }                           if (done == 0)
+     *                               // qemu_event_wait() not called
+     */
+    qatomic_set(&ev->value, qatomic_load_acquire(&ev->value) | EV_FREE);
+#endif
  }

  void qemu_event_wait(QemuEvent *ev)


Do you think it's incorrect?  I'll wait for your answer before sending
out the actual pull request.

It's correct, but I don't think it's worthwhile.

This code path is only used by platforms without a futex wrapper. Currently we only have one for Linux and this series adds one for Windows, but FreeBSD[1] and OpenBSD[2] have their own futex. macOS also gained one with version 14.4.[3] We can add wrappers for them too if their performance really matters.

So the only platforms listed in docs/about/build-platforms.rst that require the non-futex version are macOS older than 14.4 and NetBSD. macOS older than 14.4 will not be supported after June 5 since macOS 14 was released June 5, 2023 and docs/about/build-platforms.rst says:

> Support for the previous major version will be dropped 2 years after
> the new major version is released or when the vendor itself drops
> support, whichever comes first.

> Within each major release, only the most recent minor release is
> considered.

There are too few relevant platforms to justify the effort potentially needed for quality assurance.

Moreover, qemu_event_reset() is often followed by qemu_event_wait() or other barriers so probably relaxing ordering here does not affect the overall ordering constraint (and performance) much.

Regards,
Akihiko Odaki

[1] https://man.freebsd.org/cgi/man.cgi?query=_umtx_op&apropos=0&sektion=2&manpath=FreeBSD+14.2-RELEASE&arch=default&format=html
[2] https://man.openbsd.org/futex
[3] https://developer.apple.com/documentation/os/synchronization?language=objc#Futex-Conditional-Wait-Primitives

Reply via email to