* Kevin Wolf (kw...@redhat.com) wrote:
> On 02.01.2020 at 14:25, Dr. David Alan Gilbert wrote:
> > * Kevin Wolf (kw...@redhat.com) wrote:
> > > On 19.12.2019 at 15:26, Max Reitz wrote:
> > > > On 17.12.19 15:59, Kevin Wolf wrote:
> > > > > This tests creating an external snapshot with VM state (which
> > > > > results in an active overlay over an inactive backing file,
> > > > > which is also the root node of an inactive BlockBackend),
> > > > > re-activating the images and performing some operations to test
> > > > > that the re-activation worked as intended.
> > > > >
> > > > > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > > >
> > > > [...]
> > > >
> > > > > diff --git a/tests/qemu-iotests/280.out b/tests/qemu-iotests/280.out
> > > > > new file mode 100644
> > > > > index 0000000000..5d382faaa8
> > > > > --- /dev/null
> > > > > +++ b/tests/qemu-iotests/280.out
> > > > > @@ -0,0 +1,50 @@
> > > > > +Formatting 'TEST_DIR/PID-base', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
> > > > > +
> > > > > +=== Launch VM ===
> > > > > +Enabling migration QMP events on VM...
> > > > > +{"return": {}}
> > > > > +
> > > > > +=== Migrate to file ===
> > > > > +{"execute": "migrate", "arguments": {"uri": "exec:cat > /dev/null"}}
> > > > > +{"return": {}}
> > > > > +{"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > > > > +{"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > > > > +{"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > > > > +
> > > > > +VM is now stopped:
> > > > > +completed
> > > > > +{"execute": "query-status", "arguments": {}}
> > > > > +{"return": {"running": false, "singlestep": false, "status": "postmigrate"}}
> > > >
> > > > Hmmm, I get a finish-migrate status here (on tmpfs)...
> > >
> > > Dave, is it intentional that the "completed" migration event is
> > > emitted while we are still in finish-migrate rather than postmigrate?
> >
> > Yes, it looks like it; it's that the migration state machine hitting
> > COMPLETED is what then _causes_ the runstate transition to POSTMIGRATE.
> >
> > static void migration_iteration_finish(MigrationState *s)
> > {
> >     /* If we enabled cpu throttling for auto-converge, turn it off. */
> >     cpu_throttle_stop();
> >
> >     qemu_mutex_lock_iothread();
> >     switch (s->state) {
> >     case MIGRATION_STATUS_COMPLETED:
> >         migration_calculate_complete(s);
> >         runstate_set(RUN_STATE_POSTMIGRATE);
> >         break;
> >
> > Then there are a bunch of error cases where, if it landed in
> > FAILED/CANCELLED etc., we either restart the VM or also go to
> > POSTMIGRATE.
>
> Yes, I read the code. My question was more whether there is a reason
> why we want things to look like this in the external interface.
>
> I just thought that it was confusing that migration is already called
> completed when it will still change the runstate. But I guess the
> opposite could be confusing as well (if we're in postmigrate, why
> should the migration status still change?).
>
> > > I guess we could change wait_migration() in qemu-iotests to wait
> > > for the postmigrate state rather than the "completed" event, but
> > > maybe it would be better to change the migration code to avoid
> > > similar races in other QMP clients.
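(As an illustration only: an untested sketch of what polling the
runstate rather than waiting for the "completed" event could look like
in iotests-style Python. The helper name wait_for_runstate is made up
here; vm.qmp() is the normal iotests.VM interface.)

    import time

    def wait_for_runstate(vm, expected, interval=0.1):
        # Poll query-status until the runstate matches. Unlike waiting
        # for the MIGRATION "completed" event, this cannot return while
        # the VM is still in the transient finish-migrate state.
        while vm.qmp('query-status')['return']['status'] != expected:
            time.sleep(interval)

    # e.g. on the source, after issuing 'migrate':
    # wait_for_runstate(vm, 'postmigrate')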
> >
> > Given that the migration state machine is driving the runstate state
> > machine, I think it currently makes sense internally (although I
> > don't think it's documented to be in that order, or tested to be,
> > which we might want to fix).
>
> In any case, I seem to remember that it's inconsistent between source
> and destination. On one side, the migration status is updated first;
> on the other side, the runstate is updated first.
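(Illustrative only - I haven't re-verified the exact interleaving, and
event_wait() / qmp() here are the usual iotests.VM helpers - but
roughly what that inconsistency means for a QMP client watching both
ends:)

    # Source: the migration status changes first, the runstate follows:
    src.event_wait('MIGRATION')   # status may already be 'completed'...
    src.qmp('query-status')       # ...while still in 'finish-migrate'

    # Destination: the runstate changes first, the status follows:
    dst.qmp('query-status')       # may already report 'running'...
    dst.event_wait('MIGRATION')   # ...before 'completed' is emitted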
(Digging through old mails) That might be partially due to my ed1f30
from 2015, where I moved the COMPLETED event later - prior to that it
was much too early: before the network announce and before
bdrv_invalidate_cache_all, and I ended up moving it right to the end.
It might have been better to leave it before the runstate change.

> > Looking at 234 and 262, it looks like you're calling wait_migration
> > on both the source and dest; I don't think the dest will see
> > POSTMIGRATE. Also note that, depending on what you're trying to do,
> > with postcopy you'll be running on the destination before you see
> > COMPLETED.
> >
> > Waiting for the destination to leave the 'inmigrate' state is
> > probably the best strategy; then wait for the source to be in
> > postmigrate. You can cause early exits if you see transitions to
> > 'FAILED' - but actually the destination will likely quit in that
> > case, so it should be much rarer for you to hit a timeout on a
> > failed migration.
>
> Commit 37ff7d70 changed it to wait for "postmigrate" on the source and
> "running" on the destination, which I guess is good enough for a test
> case that doesn't expect failure.

Dave

> Kevin

-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
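(An untested iotests-style sketch of the strategy described above -
wait for the destination to leave 'inmigrate', then for the source to
reach 'postmigrate'. The name wait_migration_end is made up here;
vm.qmp() is the real iotests.VM helper.)

    import time

    def wait_migration_end(src_vm, dst_vm, interval=0.1):
        # Destination first: leaving 'inmigrate' means the incoming
        # migration has finished loading the state.
        while dst_vm.qmp('query-status')['return']['status'] == 'inmigrate':
            time.sleep(interval)
        # Then the source: 'postmigrate' means the runstate transition
        # that follows COMPLETED has happened as well.
        while src_vm.qmp('query-status')['return']['status'] != 'postmigrate':
            time.sleep(interval)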