Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all

Paolo Bonzini Wed, 13 May 2015 04:35:20 -0700


On 13/05/2015 13:08, Fam Zheng wrote:
> > I think this isn't enough.  It's the callers of bdrv_drain and
> > bdrv_drain_all that need to block before drain and unblock before
> > aio_context_release.
> 
> Which callers do you mean? qmp_transaction is covered in this series.


All of them. :(

In some cases it may be unnecessary.  I'm thinking of bdrv_set_aio_context,
and mig_save_device_dirty, but that's probably the exception rather than
the rule.

In some cases, like bdrv_snapshot_delete, the change is trivial.

The only somewhat more complex case is probably block/mirror.c.  There,
the aio_context_release happens through block_job_sleep_ns or
qemu_coroutine_yield.  I guess the change would look something like
this sketch:

diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..dcfede0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -513,16 +513,29 @@ static void coroutine_fn mirror_run(void *opaque)
 
         if (cnt == 0 && should_complete) {
             /* The dirty bitmap is not updated while operations are pending.
-             * If we're about to exit, wait for pending operations before
-             * calling bdrv_get_dirty_count(bs), or we may exit while the
+             * If we're about to exit, wait for pending operations and
+             * recheck bdrv_get_dirty_count(bs), or we may exit while the
              * source has dirty data to copy!
              *
              * Note that I/O can be submitted by the guest while
-             * mirror_populate runs.
+             * the rest of mirror_populate runs, but we lock it out here.
              */
             trace_mirror_before_drain(s, cnt);
+
+            // ... block ...
             bdrv_drain(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
+
+            if (cnt == 0) {
+                /* The two disks are in sync.  Exit and report successful
+                 * completion.
+                 */
+                assert(s->synced && QLIST_EMPTY(&bs->tracked_requests));
+                s->common.cancelled = false;
+                break; // ... and unblock somewhere else...
+            }
+
+            // ... unblock ...
         }
 
         ret = 0;
@@ -535,13 +549,6 @@ static void coroutine_fn mirror_run(void *opaque)
         } else if (!should_complete) {
             delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
-        } else if (cnt == 0) {
-            /* The two disks are in sync.  Exit and report successful
-             * completion.
-             */
-            assert(QLIST_EMPTY(&bs->tracked_requests));
-            s->common.cancelled = false;
-            break;
         }
         last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     }


It can be the topic of a separate series.  But this patch brings a
false sense of security (either the blocker is unnecessary, or it
needs to last after bdrv_drain returns), so I think it should be
dropped.

Paolo

Re: [Qemu-devel] [PATCH v2 11/11] block: Block "device IO" during bdrv_drain and bdrv_drain_all

Reply via email to