* Kevin Wolf (kw...@redhat.com) wrote:
> On 02.01.2020 at 14:25, Dr. David Alan Gilbert wrote:
> > * Kevin Wolf (kw...@redhat.com) wrote:
> > > On 19.12.2019 at 15:26, Max Reitz wrote:
> > > > On 17.12.19 15:59, Kevin Wolf wrote:
> > > > > This tests creating an external snapshot with VM state (which
> > > > > results in an active overlay over an inactive backing file,
> > > > > which is also the root node of an inactive BlockBackend),
> > > > > re-activating the images and performing some operations to test
> > > > > that the re-activation worked as intended.
> > > > >
> > > > > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > > >
> > > > [...]
> > > >
> > > > > diff --git a/tests/qemu-iotests/280.out b/tests/qemu-iotests/280.out
> > > > > new file mode 100644
> > > > > index 0000000000..5d382faaa8
> > > > > --- /dev/null
> > > > > +++ b/tests/qemu-iotests/280.out
> > > > > @@ -0,0 +1,50 @@
> > > > > +Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
> > > > > +
> > > > > +=== Launch VM ===
> > > > > +Enabling migration QMP events on VM...
> > > > > +{"return": {}}
> > > > > +
> > > > > +=== Migrate to file ===
> > > > > +{"execute": "migrate", "arguments": {"uri": "exec:cat > /dev/null"}}
> > > > > +{"return": {}}
> > > > > +{"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > > > > +{"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > > > > +{"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > > > > +
> > > > > +VM is now stopped:
> > > > > +completed
> > > > > +{"execute": "query-status", "arguments": {}}
> > > > > +{"return": {"running": false, "singlestep": false, "status": "postmigrate"}}
> > > >
> > > > Hmmm, I get a finish-migrate status here (on tmpfs)...
> > >
> > > Dave, is it intentional that the "completed" migration event is
> > > emitted while we are still in finish-migrate rather than postmigrate?
> >
> > Yes, it looks like it; it's that the migration state machine hitting
> > COMPLETED is what then _causes_ the runstate transition to POSTMIGRATE.
> >
> > static void migration_iteration_finish(MigrationState *s)
> > {
> >     /* If we enabled cpu throttling for auto-converge, turn it off. */
> >     cpu_throttle_stop();
> >
> >     qemu_mutex_lock_iothread();
> >     switch (s->state) {
> >     case MIGRATION_STATUS_COMPLETED:
> >         migration_calculate_complete(s);
> >         runstate_set(RUN_STATE_POSTMIGRATE);
> >         break;
> >
> > Then there are a bunch of error cases where, if it landed in
> > FAILED/CANCELLED etc., we either restart the VM or also go to
> > POSTMIGRATE.
>
> Yes, I read the code. My question was more whether there is a reason
> why we want things to look like this in the external interface.
>
> I just thought that it was confusing that migration is already called
> completed when it will still change the runstate. But I guess the
> opposite could be confusing as well (if we're in postmigrate, why
> should the migration status still change?).
>
> > > I guess we could change wait_migration() in qemu-iotests to wait
> > > for the postmigrate state rather than the "completed" event, but
> > > maybe it would be better to change the migration code to avoid
> > > similar races in other QMP clients.
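(As an illustration only: an untested sketch of what polling the
runstate rather than waiting for the "completed" event could look like
in iotests-style Python. The helper name wait_for_runstate is made up
here; vm.qmp() is the normal iotests.VM interface.)

    import time

    def wait_for_runstate(vm, expected, interval=0.1):
        # Poll query-status until the runstate matches. Unlike waiting
        # for the MIGRATION "completed" event, this cannot return while
        # the VM is still in the transient finish-migrate state.
        while vm.qmp('query-status')['return']['status'] != expected:
            time.sleep(interval)

    # e.g. on the source, after issuing 'migrate':
    # wait_for_runstate(vm, 'postmigrate')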
> >
> > Given that the migration state machine is driving the runstate state
> > machine, I think it currently makes sense internally (although I
> > don't think it's documented to be in that order, or tested to be,
> > which we might want to fix).
>
> In any case, I seem to remember that it's inconsistent between source
> and destination. On one side, the migration status is updated first;
> on the other side, the runstate is updated first.
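(Illustrative only - I haven't re-verified the exact interleaving, and
event_wait() / qmp() here are the usual iotests.VM helpers - but
roughly what that inconsistency means for a QMP client watching both
ends:)

    # Source: the migration status changes first, the runstate follows:
    src.event_wait('MIGRATION')   # status may already be 'completed'...
    src.qmp('query-status')       # ...while still in 'finish-migrate'

    # Destination: the runstate changes first, the status follows:
    dst.qmp('query-status')       # may already report 'running'...
    dst.event_wait('MIGRATION')   # ...before 'completed' is emitted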
(Digging through old mails) That might be partially due to my ed1f30
from 2015, where I moved the COMPLETED event later - prior to that it
was much too early: before the network announce and before
bdrv_invalidate_cache_all, and I ended up moving it right to the end.
It might have been better to leave it before the runstate change.

> > Looking at 234 and 262, it looks like you're calling wait_migration
> > on both the source and dest; I don't think the dest will see
> > POSTMIGRATE. Also note that, depending on what you're trying to do,
> > with postcopy you'll be running on the destination before you see
> > COMPLETED.
> >
> > Waiting for the destination to leave the 'inmigrate' state is
> > probably the best strategy; then wait for the source to be in
> > postmigrate. You can cause early exits if you see transitions to
> > 'FAILED' - but actually the destination will likely quit in that
> > case, so it should be much rarer for you to hit a timeout on a
> > failed migration.
>
> Commit 37ff7d70 changed it to wait for "postmigrate" on the source and
> "running" on the destination, which I guess is good enough for a test
> case that doesn't expect failure.

Dave

> Kevin

-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
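(An untested iotests-style sketch of the strategy described above -
wait for the destination to leave 'inmigrate', then for the source to
reach 'postmigrate'. The name wait_migration_end is made up here;
vm.qmp() is the real iotests.VM helper.)

    import time

    def wait_migration_end(src_vm, dst_vm, interval=0.1):
        # Destination first: leaving 'inmigrate' means the incoming
        # migration has finished loading the state.
        while dst_vm.qmp('query-status')['return']['status'] == 'inmigrate':
            time.sleep(interval)
        # Then the source: 'postmigrate' means the runstate transition
        # that follows COMPLETED has happened as well.
        while src_vm.qmp('query-status')['return']['status'] != 'postmigrate':
            time.sleep(interval)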