On 3/12/24 14:34, Cédric Le Goater wrote:
On 3/12/24 13:32, Cédric Le Goater wrote:
On 3/11/24 20:03, Fabiano Rosas wrote:
Cédric Le Goater <c...@redhat.com> writes:
On 3/8/24 15:36, Fabiano Rosas wrote:
Cédric Le Goater <c...@redhat.com> writes:
This prepares ground for the changes coming next which add an Error**
argument to the .save_setup() handler. Callers of qemu_savevm_state_setup()
now handle the error and fail earlier, setting the migration state from
MIGRATION_STATUS_SETUP to MIGRATION_STATUS_FAILED.
In qemu_savevm_state(), move the cleanup to preserve the error
reported by .save_setup() handlers.
Since the previous behavior was to ignore errors at this step of
migration, this change should be examined closely to check that
cleanups are still correctly done.
Signed-off-by: Cédric Le Goater <c...@redhat.com>
---
Changes in v4:
- Merged cleanup change in qemu_savevm_state()
Changes in v3:
- Set migration state to MIGRATION_STATUS_FAILED
- Fixed error handling to be done under lock in bg_migration_thread()
- Made sure an error is always set in case of failure in
qemu_savevm_state_setup()
migration/savevm.h | 2 +-
migration/migration.c | 27 ++++++++++++++++++++++++---
migration/savevm.c | 26 +++++++++++++++-----------
3 files changed, 40 insertions(+), 15 deletions(-)
diff --git a/migration/savevm.h b/migration/savevm.h
index 74669733dd63a080b765866c703234a5c4939223..9ec96a995c93a42aad621595f0ed58596c532328 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -32,7 +32,7 @@
bool qemu_savevm_state_blocked(Error **errp);
void qemu_savevm_non_migratable_list(strList **reasons);
int qemu_savevm_state_prepare(Error **errp);
-void qemu_savevm_state_setup(QEMUFile *f);
+int qemu_savevm_state_setup(QEMUFile *f, Error **errp);
bool qemu_savevm_state_guest_unplug_pending(void);
int qemu_savevm_state_resume_prepare(MigrationState *s);
void qemu_savevm_state_header(QEMUFile *f);
diff --git a/migration/migration.c b/migration/migration.c
index a49fcd53ee19df1ce0182bc99d7e064968f0317b..6d1544224e96f5edfe56939a9c8395d88ef29581 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3408,6 +3408,8 @@ static void *migration_thread(void *opaque)
int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
MigThrError thr_error;
bool urgent = false;
+ Error *local_err = NULL;
+ int ret;
thread = migration_threads_add("live_migration", qemu_get_thread_id());
@@ -3451,9 +3453,17 @@ static void *migration_thread(void *opaque)
}
bql_lock();
- qemu_savevm_state_setup(s->to_dst_file);
+ ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
+ if (ret) {
+ migrate_set_error(s, local_err);
+ error_free(local_err);
+ migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_FAILED);
+ goto out;
+ }
+
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
This^ should be before the new block it seems:
GOOD:
migrate_set_state new state setup
migrate_set_state new state wait-unplug
migrate_fd_cancel
migrate_set_state new state cancelling
migrate_fd_cleanup
migrate_set_state new state cancelled
migrate_fd_cancel
ok 1 /x86_64/failover-virtio-net/migrate/abort/wait-unplug
BAD:
migrate_set_state new state setup
migrate_fd_cancel
migrate_set_state new state cancelling
migrate_fd_cleanup
migrate_set_state new state cancelled
qemu-system-x86_64: ram_save_setup failed: Input/output error
**
ERROR:../tests/qtest/virtio-net-failover.c:1203:test_migrate_abort_wait_unplug:
assertion failed (status == "cancelling"): ("cancelled" == "cancelling")
Otherwise migration_iteration_finish() will schedule the cleanup BH, which
will run concurrently with the migrate_fd_cancel() issued by the test,
and bad things happen.
This hack makes things work:
@@ -3452,6 +3452,9 @@ static void *migration_thread(void *opaq
qemu_savevm_send_colo_enable(s->to_dst_file);
}
+ qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
+ MIGRATION_STATUS_SETUP);
+
Why move it all the way up here? Has moving the wait_unplug before the
'if (ret)' block not worked for you?
We could be sleeping while holding the BQL. It looked wrong.
Sorry, wrong answer. Yes, I can try moving it before the 'if (ret)' block.
I can reproduce easily with an x86 guest running on PPC64.
That works just the same.
Peter, Fabiano,
What would you prefer?
1. Move qemu_savevm_wait_unplug() before qemu_savevm_state_setup(),
which means one new patch.
2. Leave qemu_savevm_wait_unplug() after qemu_savevm_state_setup()
and handle qemu_savevm_state_setup() errors after waiting, which
means an update of this patch.
Thanks,
C.
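
To compare the two options, here is a rough standalone model of the state
transitions each ordering would produce. It is only a sketch, not QEMU code:
the Status enum, the Mig struct, the trace array and the run_option1() and
run_option2() helpers are all hypothetical, and wait_unplug() only imitates
qemu_savevm_wait_unplug() without blocking. The one behavior assumed from
QEMU is that migrate_set_state() changes state only when the current state
matches the expected old state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the MIGRATION_STATUS_* values in the thread. */
typedef enum { ST_SETUP, ST_WAIT_UNPLUG, ST_ACTIVE, ST_FAILED } Status;

typedef struct {
    Status state;
    bool unplug_pending; /* a failover device still has to be unplugged */
    int setup_ret;       /* value the setup stub returns; non-zero = error */
    Status trace[8];     /* every state entered, in order */
    int ntrace;
} Mig;

/* Modeled on migrate_set_state(): move to new_state only if still in old. */
void set_state(Mig *m, Status old, Status new_state)
{
    if (m->state == old) {
        assert(m->ntrace < 8);
        m->state = new_state;
        m->trace[m->ntrace++] = new_state;
    }
}

/* Simplified qemu_savevm_wait_unplug(): the real one blocks until unplug. */
void wait_unplug(Mig *m, Status old, Status new_state)
{
    if (m->unplug_pending) {
        set_state(m, old, ST_WAIT_UNPLUG);
        m->unplug_pending = false; /* pretend the guest finished unplugging */
        set_state(m, ST_WAIT_UNPLUG, new_state);
    } else {
        set_state(m, old, new_state);
    }
}

/* Option 1: wait for unplug first, staying in SETUP, then run setup. */
Status run_option1(Mig *m)
{
    wait_unplug(m, ST_SETUP, ST_SETUP);
    set_state(m, ST_SETUP, m->setup_ret ? ST_FAILED : ST_ACTIVE);
    return m->state;
}

/* Option 2: run setup, wait for unplug, and only then act on the error. */
Status run_option2(Mig *m)
{
    int ret = m->setup_ret; /* stands in for qemu_savevm_state_setup() */

    wait_unplug(m, ST_SETUP, ST_ACTIVE);
    if (ret) {
        set_state(m, ST_ACTIVE, ST_FAILED);
    }
    return m->state;
}
```

With unplug_pending set and a failing setup, both helpers pass through the
wait-unplug state before reaching failed. The difference in this model is
the state the failure is reported from: SETUP for option 1, ACTIVE for
option 2.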