Re: [Qemu-devel] [Qemu-block] [PATCH v1 1/1] iotests: fix test case 185

Kevin Wolf Tue, 20 Mar 2018 03:12:13 -0700

Am 19.03.2018 um 18:53 hat Christian Borntraeger geschrieben:
> 
> 
> On 03/19/2018 05:10 PM, Stefan Hajnoczi wrote:
> > On Mon, Mar 19, 2018 at 9:35 AM, QingFeng Hao <ha...@linux.vnet.ibm.com> 
> > wrote:
> >> Test case 185 failed since commit 4486e89c219 --- "vl: introduce 
> >> vm_shutdown()".
> >> It's because of the newly introduced function vm_shutdown calls 
> >> bdrv_drain_all,
> >> which is called later by bdrv_close_all. bdrv_drain_all resumes the jobs
> >> that doubles the speed and offset is doubled.
> >> Some jobs' status are changed as well.
> >>
> >> Thus, let's not call bdrv_drain_all in vm_shutdown.
> >>
> >> Signed-off-by: QingFeng Hao <ha...@linux.vnet.ibm.com>
> >> ---
> >>  cpus.c | 5 +++--
> >>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/cpus.c b/cpus.c
> >> index 2e6701795b..ae2962508c 100644
> >> --- a/cpus.c
> >> +++ b/cpus.c
> >> @@ -1006,8 +1006,9 @@ static int do_vm_stop(RunState state, bool send_stop)
> >>              qapi_event_send_stop(&error_abort);
> >>          }
> >>      }
> >> -
> >> -    bdrv_drain_all();
> >> +    if (send_stop) {
> >> +        bdrv_drain_all();
> >> +    }
> > 
> > Thanks for looking into this bug!
> > 
> > This if statement breaks the contract of the function when send_stop
> > is false.  Drain ensures that pending I/O completes and that device
> > emulation finishes before this function returns.  This is important
> > for live migration, where there must be no more guest-related activity
> > after vm_stop().
> > 
> > This patch relies on the caller invoking bdrv_close_all() immediately
> > after do_vm_stop(), which is not documented and could lead to more
> > bugs in the future.
> > 
> > I suggest a different solution:
> > 
> > Test 185 relies on internal QEMU behavior which can change from time
> > to time.  Buffer sizes and block job iteration counts are
> > implementation details that are not part of qapi-schema.json or other
> > documented behavior.
> > 
> > In fact, the existing test case was already broken in IOThread mode
> > since iothread_stop_all() -> bdrv_set_aio_context() also includes a
> > bdrv_drain() with the side-effect of an extra blockjob iteration.
> > 
> > After 4486e89c219 ("vl: introduce vm_shutdown()") both IOThread and
> > non-IOThread mode perform the drain.  This has caused the test
> > failure.
> > 
> > Instead of modifying QEMU to keep the same arbitrary internal behavior
> > as before, please send a patch that updates tests/qemu-iotests/185.out
> > with the latest output.
> 
> Wouldnt it be better if the test actually masks out the values the are not
> really part of the "agreed upon" behaviour? Wouldnt block_job_offset from
> common.filter be what we want?


The test case has the following note:

# Note that the reference output intentionally includes the 'offset' field in
# BLOCK_JOB_CANCELLED events for all of the following block jobs. They are
# predictable and any change in the offsets would hint at a bug in the job
# throttling code.

Now the question is if this particular change is okay rather than
hinting at a bug and we should update the reference output or whether we
need to change qemu again.

I think the change isn't really bad, but we are doing more useless work
now than we used to do before, and we're exiting slower because we're
spawning additional I/O that we have to wait for. So the better state
was certainly the old one.

Kevin

Re: [Qemu-devel] [Qemu-block] [PATCH v1 1/1] iotests: fix test case 185

Reply via email to