On 10/8/2024 11:35 AM, Peter Xu wrote:
On Mon, Oct 07, 2024 at 04:52:43PM -0400, Steven Sistare wrote:
On 10/7/2024 11:27 AM, Peter Xu wrote:
On Mon, Sep 30, 2024 at 12:40:35PM -0700, Steve Sistare wrote:
Stop the vm earlier for cpr, to guarantee consistent device state when
CPR state is saved.
Could you add some more info on why this order matters?
E.g., qmp_migrate should switch migration state machine to SETUP, while
this path holds BQL, I think it means there's no way devices got hot added
concurrently of the whole process.
Would other things change in the cpr states (name, fd, etc.)? It'll be
great to mention these details in the commit message.
Because of the new cpr-state save operation needed by this mode,
I created this patch to be future proof. Performing a save operation while
the machine is running is asking for trouble. But right now, I am not aware
of any specific issues.
Later in the "tap and vhost" series there is another reason to stop the vm here
and
save cpr state, because the devices must be stopped in old qemu before they
are initialized in new qemu. If you are curious, see the 2 patches I attached
to the email at
https://lore.kernel.org/qemu-devel/fa95c40d-b5e5-41eb-bba7-7842bca2f...@oracle.com/
But, that has nothing to do with the contents of cpr state.
Then I suggest we leave this patch to the vhost/tap series, then please
document clearly in the commit mesasge on why this is needed. Linking to
that discussion thread could work too.
OK.
Side note: I saw you have MIG_EVENT_PRECOPY_CPR_SETUP in you own tree, I
wonder whether we could reuse MIG_EVENT_PRECOPY_SETUP by moving it earlier
in qmp_migrate(). After all CPR-* notifiers are already registered
separately with the list of migration_state_notifiers[], so I suppose it'll
service the same purpose. But we can discuss that later.
Sure, we can discuss later (and I'll take another look before posting the
vhost/tap
series).
- Steve