On 2018-04-10 10:11, Stefan Hajnoczi wrote: > On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote: >> On 2018-04-04 17:01, Stefan Hajnoczi wrote: >> === Start mirror job and exit qemu === >> >> This seems to be independent of whether there is actually data on >> TEST_IMG (the commit source), so something doesn't seem quite right with >> the block job throttling here...? > > I've been playing around with this test failure. Let's leave it broken > (!) in QEMU 2.12 because this test has uncovered a block job ratelimit > issue that's not a regression. The fix shouldn't be rushed.
OK for me.
> block/mirror.c calculates delay_ns but then discards it:
>
> static void coroutine_fn mirror_run(void *opaque)
> {
> ...
>
> for (;;) {
> ...delay_ns is calculated based on job speed...
>
> ret = 0;
> trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
> if (block_job_is_cancelled(&s->common) && s->common.force) {
> break;
> } else if (!should_complete) {
> delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
> ^--- we discard it!
> block_job_sleep_ns(&s->common, delay_ns);
> }
> s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> }
>
> ...
> }
Uh, nice.
> This is why the mirror-based tests are completing even though we'd
> expect them to only reach the "next" iteration due to the job speed.
>
> Increasing the 4 MB write operation in the test to >16 MB (the mirror
> buffer size) doesn't solve the problem because the next QMP command will
> race with the job's 0 ns sleep. It would just make the test unreliable.
>
> I'll work on the following for QEMU 2.13:
>
> 1. Avoid spurious block_job_enter() from block_job_drain(). This
> eliminates the extra block job iteration that changed the output in
> the first place.
>
> 2. Honor the job speed in block/mirror.c.
>
> The end result will be that the test output will not require changes.
Thanks!
Max
signature.asc
Description: OpenPGP digital signature
