On 31.01.24 11:17, Kevin Wolf wrote:
Am 29.01.2024 um 17:30 hat Hanna Czenczek geschrieben:
I don’t like using drain as a form of lock specifically against AioContext
changes, but maybe Stefan is right, and we should use it in this specific
case to get just the single problem fixed.  (Though it’s not quite trivial
either.  We’d probably still want to remove the assertion from
blk_get_aio_context(), so we don’t have to require all of its callers to
hold a count in the in-flight counter.)
Okay, fair, maybe fixing the specific problem is more important that
solving the more generic blk_get_aio_context() race.

In this case, wouldn't it be enough to increase the in-flight counter so
that the drain before switching AioContexts would run the BH before
anything bad can happen? Does the following work?

Yes, that’s what I had in mind (Stefan, too, I think), and in testing, it looks good.

Hanna


Kevin

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 0a2eb11c56..dc09eb8024 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -120,17 +120,11 @@ static void scsi_device_for_each_req_async_bh(void 
*opaque)
      SCSIRequest *next;
/*
-     * If the AioContext changed before this BH was called then reschedule into
-     * the new AioContext before accessing ->requests. This can happen when
-     * scsi_device_for_each_req_async() is called and then the AioContext is
-     * changed before BHs are run.
+     * The AioContext can't have changed because we increased the in-flight
+     * counter for s->conf.blk.
       */
      ctx = blk_get_aio_context(s->conf.blk);
-    if (ctx != qemu_get_current_aio_context()) {
-        aio_bh_schedule_oneshot(ctx, scsi_device_for_each_req_async_bh,
-                                g_steal_pointer(&data));
-        return;
-    }
+    assert(ctx == qemu_get_current_aio_context());
QTAILQ_FOREACH_SAFE(req, &s->requests, next, next) {
          data->fn(req, data->fn_opaque);
@@ -138,6 +132,7 @@ static void scsi_device_for_each_req_async_bh(void *opaque)
/* Drop the reference taken by scsi_device_for_each_req_async() */
      object_unref(OBJECT(s));
+    blk_dec_in_flight(s->conf.blk);
  }
/*
@@ -163,6 +158,7 @@ static void scsi_device_for_each_req_async(SCSIDevice *s,
       */
      object_ref(OBJECT(s));
+ blk_inc_in_flight(s->conf.blk);
      aio_bh_schedule_oneshot(blk_get_aio_context(s->conf.blk),
                              scsi_device_for_each_req_async_bh,
                              data);



Reply via email to