On Mon, Sep 30, 2024 at 12:40:35PM -0700, Steve Sistare wrote: > Stop the vm earlier for cpr, to guarantee consistent device state when > CPR state is saved.
Could you add some more info on why this order matters? E.g., qmp_migrate should switch migration state machine to SETUP, while this path holds BQL, I think it means there's no way devices got hot added concurrently of the whole process. Would other things change in the cpr states (name, fd, etc.)? It'll be great to mention these details in the commit message. Thanks, > > Signed-off-by: Steve Sistare <steven.sist...@oracle.com> > --- > migration/migration.c | 22 +++++++++++++--------- > 1 file changed, 13 insertions(+), 9 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index df00e5c..868bf0e 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -2082,6 +2082,7 @@ void qmp_migrate(const char *uri, bool has_channels, > MigrationState *s = migrate_get_current(); > g_autoptr(MigrationChannel) channel = NULL; > MigrationAddress *addr = NULL; > + bool stopped = false; > > /* > * Having preliminary checks for uri and channel > @@ -2125,6 +2126,15 @@ void qmp_migrate(const char *uri, bool has_channels, > } > } > > + if (migrate_mode_is_cpr(s)) { > + int ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE); > + if (ret < 0) { > + error_setg(&local_err, "migration_stop_vm failed, error %d", > -ret); > + goto out; > + } > + stopped = true; > + } > + > if (cpr_state_save(&local_err)) { > goto out; > } > @@ -2160,6 +2170,9 @@ out: > } > migrate_fd_error(s, local_err); > error_propagate(errp, local_err); > + if (stopped) { > + vm_resume(s->vm_old_state); > + } > return; > } > } > @@ -3743,7 +3756,6 @@ void migrate_fd_connect(MigrationState *s, Error > *error_in) > Error *local_err = NULL; > uint64_t rate_limit; > bool resume = (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP); > - int ret; > > /* > * If there's a previous error, free it and prepare for another one. > @@ -3815,14 +3827,6 @@ void migrate_fd_connect(MigrationState *s, Error > *error_in) > return; > } > > - if (migrate_mode_is_cpr(s)) { > - ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE); > - if (ret < 0) { > - error_setg(&local_err, "migration_stop_vm failed, error %d", > -ret); > - goto fail; > - } > - } > - > if (migrate_background_snapshot()) { > qemu_thread_create(&s->thread, "mig/snapshot", > bg_migration_thread, s, QEMU_THREAD_JOINABLE); > -- > 1.8.3.1 > -- Peter Xu