On Mon, Oct 07, 2024 at 04:52:43PM -0400, Steven Sistare wrote:
> On 10/7/2024 11:27 AM, Peter Xu wrote:
> > On Mon, Sep 30, 2024 at 12:40:35PM -0700, Steve Sistare wrote:
> > > Stop the vm earlier for cpr, to guarantee consistent device state when
> > > CPR state is saved.
> > 
> > Could you add some more info on why this order matters?
> > 
> > E.g., qmp_migrate should switch migration state machine to SETUP, while
> > this path holds BQL, I think it means there's no way devices got hot added
> > concurrently of the whole process.
> > 
> > Would other things change in the cpr states (name, fd, etc.)?  It'll be
> > great to mention these details in the commit message.
> 
> Because of the new cpr-state save operation needed by this mode,
> I created this patch to be future proof.  Performing a save operation while
> the machine is running is asking for trouble.  But right now, I am not aware
> of any specific issues.
> 
> Later in the "tap and vhost" series there is another reason to stop the vm 
> here and
> save cpr state, because the devices must be stopped in old qemu before they
> are initialized in new qemu.  If you are curious, see the 2 patches I attached
> to the email at
>   
> https://lore.kernel.org/qemu-devel/fa95c40d-b5e5-41eb-bba7-7842bca2f...@oracle.com/
> But, that has nothing to do with the contents of cpr state.

Then I suggest we leave this patch to the vhost/tap series, then please
document clearly in the commit mesasge on why this is needed.  Linking to
that discussion thread could work too.

Side note: I saw you have MIG_EVENT_PRECOPY_CPR_SETUP in you own tree, I
wonder whether we could reuse MIG_EVENT_PRECOPY_SETUP by moving it earlier
in qmp_migrate().  After all CPR-* notifiers are already registered
separately with the list of migration_state_notifiers[], so I suppose it'll
service the same purpose.  But we can discuss that later.

-- 
Peter Xu


Reply via email to