Peter Xu <pet...@redhat.com> writes:

> Both dump-guest-memory and live migration have vm state cached internally.
> Allowing them to happen together means the vm state can be messed up.  Simply
> block live migration for dump-guest-memory.
>
> One trivial thing to mention is we should still allow dump-guest-memory even if
> -only-migratable is specified, because that flag should majorly be used to
> guarantee not adding devices that will block migration by accident.  Dump guest
> memory is not like that - it'll only block for the seconds when it's dumping.

I recently ran into a similarly unusual use of migration blockers:

    Subject: -only-migrate and the two different uses of migration blockers
     (was: spapr_events: Sure we may ignore migrate_add_blocker() failure?)
    Date: Mon, 19 Jul 2021 13:00:20 +0200 (5 weeks, 1 day, 20 hours ago)
    Message-ID: <87sg0amuuz.fsf...@dusky.pond.sub.org>

    We appear to use migration blockers in two ways:

    (1) Prevent migration for an indefinite time, typically due to use of
    some feature that isn't compatible with migration.

    (2) Delay migration for a short time.

    Option -only-migratable is designed for (1).  It interferes with (2).

    Example for (1): device "x-pci-proxy-dev" doesn't support migration.  It
    adds a migration blocker on realize, and deletes it on unrealize.  With
    -only-migratable, device realize fails.  Works as designed.

    Example for (2): spapr_mce_req_event() makes an effort to prevent
    migration from degrading the reporting of FWNMIs.  It adds a migration
    blocker when it receives one, and deletes it when it's done handling
    it.  This is a best effort; if migration is already in progress by the
    time FWNMI is received, we simply carry on, and that's okay.  However,
    option -only-migratable sabotages the best effort entirely.

    While this isn't exactly terrible, it may be a weakness in our thinking
    and our infrastructure.  I'm bringing it up so the people in charge are
    aware :)

https://lists.nongnu.org/archive/html/qemu-devel/2021-07/msg04723.html
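
To make the two uses concrete, here is a toy model in C.  It is a sketch
under stated assumptions, not QEMU code: struct toy_vm and all toy_*()
functions are made up for illustration and are not QEMU's API.  The model
assumes that adding a blocker is refused both when the only-migratable
option is set and when a migration is already in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only.  All names here are hypothetical, not QEMU's API. */
struct toy_vm {
    bool only_migratable;   /* models the only-migratable option */
    bool migrating;         /* a migration is already in progress */
    int blockers;           /* number of active migration blockers */
};

/*
 * Try to add a migration blocker.
 * Returns 0 on success, -1 if the blocker is refused (assumed policy:
 * refuse when only-migratable is set or a migration is running).
 */
static int toy_add_blocker(struct toy_vm *vm)
{
    if (vm->only_migratable || vm->migrating) {
        return -1;
    }
    vm->blockers++;
    return 0;
}

/* Delete a blocker that was added successfully earlier. */
static void toy_del_blocker(struct toy_vm *vm)
{
    assert(vm->blockers > 0);
    vm->blockers--;
}

/* Migration may start only while no blocker is active. */
static bool toy_can_migrate(const struct toy_vm *vm)
{
    return vm->blockers == 0;
}

/*
 * Use (1): a device that cannot be migrated.  The blocker stays for the
 * device's lifetime, so a refused blocker makes realize fail.
 */
static int toy_realize_unmigratable_device(struct toy_vm *vm)
{
    return toy_add_blocker(vm);
}

/*
 * Use (2): best-effort protection of a short critical section.  A refused
 * blocker is not an error; we carry on unprotected.
 * Returns true if the section ran with migration blocked.
 */
static bool toy_handle_event(struct toy_vm *vm)
{
    bool guarded = toy_add_blocker(vm) == 0;

    /* ... the event would be handled here ... */

    if (guarded) {
        toy_del_blocker(vm);
    }
    return guarded;
}
```

With only_migratable set in this model, toy_realize_unmigratable_device()
fails, which is the intent of (1), but toy_handle_event() also never gets
its protection, which is the interference with (2).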

Downthread there, Dave Gilbert opined

    It almost feels like they need a way to temporarily hold off
    'completion' of migration - i.e. the phase where we stop the CPU and
    write the device data;  mind you you'd also probably want it to stop
    cold-migrates/snapshots?

