Hi On Fri, Aug 27, 2021 at 11:44 AM Marc-André Lureau < marcandre.lur...@redhat.com> wrote:
> Hi > > On Thu, Aug 26, 2021 at 10:58 PM Peter Xu <pet...@redhat.com> wrote: > >> Both dump-guest-memory and live migration caches vm state at the >> beginning. >> Either of them entering the other one will cause race on the vm state, >> and even >> more severe on that (please refer to the crash report in the bug link). >> >> Let's block live migration in dump-guest-memory, and that'll also block >> dump-guest-memory if it detected that we're during a live migration. >> >> Side note: migrate_del_blocker() can be called even if the blocker is not >> inserted yet, so it's safe to unconditionally delete that blocker in >> dump_cleanup (g_slist_remove allows no-entry-found case). >> >> Suggested-by: Dr. David Alan Gilbert <dgilb...@redhat.com> >> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1996609 >> Signed-off-by: Peter Xu <pet...@redhat.com> >> --- >> dump/dump.c | 24 +++++++++++++++++++----- >> 1 file changed, 19 insertions(+), 5 deletions(-) >> >> diff --git a/dump/dump.c b/dump/dump.c >> index ab625909f3..9c1c1fb738 100644 >> --- a/dump/dump.c >> +++ b/dump/dump.c >> @@ -29,6 +29,7 @@ >> #include "qemu/error-report.h" >> #include "qemu/main-loop.h" >> #include "hw/misc/vmcoreinfo.h" >> +#include "migration/blocker.h" >> >> #ifdef TARGET_X86_64 >> #include "win_dump.h" >> @@ -47,6 +48,8 @@ >> >> #define MAX_GUEST_NOTE_SIZE (1 << 20) /* 1MB should be enough */ >> >> +static Error *dump_migration_blocker; >> + >> #define ELF_NOTE_SIZE(hdr_size, name_size, desc_size) \ >> ((DIV_ROUND_UP((hdr_size), 4) + \ >> DIV_ROUND_UP((name_size), 4) + \ >> @@ -101,6 +104,7 @@ static int dump_cleanup(DumpState *s) >> qemu_mutex_unlock_iothread(); >> } >> } >> + migrate_del_blocker(dump_migration_blocker); >> >> return 0; >> } >> @@ -1927,11 +1931,6 @@ void qmp_dump_guest_memory(bool paging, const char >> *file, >> Error *local_err = NULL; >> bool detach_p = false; >> >> - if (runstate_check(RUN_STATE_INMIGRATE)) { >> - error_setg(errp, "Dump not allowed during incoming migration."); >> - return; >> - } >> - >> /* if there is a dump in background, we should wait until the dump >> * finished */ >> if (dump_in_progress()) { >> @@ -2005,6 +2004,21 @@ void qmp_dump_guest_memory(bool paging, const char >> *file, >> return; >> } >> >> + if (!dump_migration_blocker) { >> + error_setg(&dump_migration_blocker, >> + "Live migration disabled: dump-guest-memory in >> progress"); >> + } >> + >> + /* >> + * Allows even for -only-migratable, but forbid migration during the >> + * process of dump guest memory. >> + */ >> + if (migrate_add_blocker_internal(dump_migration_blocker, errp)) { >> + /* Remember to release the fd before passing it over to dump >> state */ >> + close(fd); >> + return; >> + } >> > > I would move it earlier. Why not leave it at the beginning of the > function as it was? > > Ah I think it's because dump_cleanup() isn't called when returning earlier. But relying on a cleanup done outside of this function is not obvious. Either dump_cleanup() should be called from here, or the blocker code should be moved to dump_init() imho. > Then dump_in_progress() is probably redundant too. > > + >> s = &dump_state_global; >> dump_state_prepare(s); >> >> -- >> 2.31.1 >> >>