On 10/25/2024 9:55 AM, Peter Xu wrote:
On Thu, Oct 24, 2024 at 05:12:05PM -0400, Steven Sistare wrote:
On 10/10/2024 5:23 PM, Peter Xu wrote:
On Thu, Oct 10, 2024 at 04:06:13PM -0400, Steven Sistare wrote:
vhost requires us to stop the vm early:
  qmp_migrate
    stop vm
    migration_call_notifiers MIG_EVENT_PRECOPY_CPR_SETUP
      vhost_cpr_notifier
        vhost_reset_device - must be after stop vm
                           - and before new qemu inits devices
    cpr_state_save
      unblocks new qemu which inits devices and calls vhost_set_owner
Thus config commands must be sent to the target during the guest pause interval
:(
I can understand it needs the VM stopped, but it can still happen after
cpr_save(), am I right (IOW, the fd won't change in the notifier)? I meant
the sequence below:
- src: cpr_save(), when running, NONE->SETUP_CPR, all fds synced
- [whatever happens..]
- src: finally decide to switchover, vm stop
- vhost notifier invoked. PS: the notifier doesn't need to be named
SETUP_CPR here, it could be something else..
The problem is that the first step, cpr_save, causes the dest to finish
cpr_load_state and proceed to initialize devices in
qemu_create_late_backends -> net_init_clients.
This calls ioctl VHOST_SET_OWNER which fails because the device is still owned
by src qemu.
src qemu releases ownership via VHOST_RESET_OWNER in the vhost notifier.
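The ordering constraint above can be illustrated with a toy model. This is only a sketch, not QEMU or kernel code: the VhostDevice class, the "src"/"dst" labels, and both helper functions are hypothetical, and the errno values (EBUSY for a second SET_OWNER, EPERM for a RESET_OWNER by a non-owner) are assumptions made for illustration.

```python
import errno


class VhostDevice:
    """Toy stand-in for a vhost device; tracks only which process owns it."""

    def __init__(self):
        self.owner = None

    def set_owner(self, proc):
        # Models ioctl VHOST_SET_OWNER: refused while an owner exists.
        # (EBUSY is an assumption about the exact errno.)
        if self.owner is not None:
            raise OSError(errno.EBUSY, "device already has an owner")
        self.owner = proc

    def reset_owner(self, proc):
        # Models ioctl VHOST_RESET_OWNER: only the current owner may release.
        # (EPERM is an assumption about the exact errno.)
        if self.owner != proc:
            raise OSError(errno.EPERM, "caller is not the owner")
        self.owner = None


def early_stop_order(dev):
    """Order described in the thread: stop vm, vhost notifier, cpr_state_save."""
    dev.reset_owner("src")  # vhost notifier runs after stop vm
    dev.set_owner("dst")    # cpr_state_save unblocks dest, which inits devices
    return dev.owner


def late_stop_order(dev):
    """Proposed alternative: cpr_save first, vhost notifier only at switchover."""
    try:
        dev.set_owner("dst")  # dest inits devices while src still owns
    except OSError as e:
        return errno.errorcode[e.errno]
    return dev.owner
```

With a device initially owned by "src", early_stop_order hands ownership to "dst", while late_stop_order returns "EBUSY" and leaves "src" as the owner, which is the failure described above.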
I think block drives had a similar ownership issue before, when the disk
is shared on both sides: ownership was only passed over to dest at
switchover, rather than at dest qemu init. In the CPR routines it'll
also happen during switchover rather than in cpr_save().
Maybe it's just harder for vhost, as I assume vhost was never designed to
be used in shared mode. Otherwise, logically, net_init_clients() could do
the rest of the initialization and provide a facility to SET_OWNER at a
later point. I'm not sure if that's possible.
net_init_clients cannot do any initialization that issues vhost ioctls,
because the dest process does not yet own the vhost device.
- Steve
For block it could be easier, IIRC it was mostly about the file lock and
who owns it (e.g. on an NFS share, to make sure no concurrent writers
corrupt the file).
Thus the guest must be paused while config commands are sent to the target.
We could avoid that with any of:
* do not issue config commands
* precreate phase
* cpr-exec mode
* only pause if vhost is present (e.g. no pause for vfio).
OK. I hope precreate will work out if that can solve this too.
Thanks,