On 10/10/2024 5:23 PM, Peter Xu wrote:
On Thu, Oct 10, 2024 at 04:06:13PM -0400, Steven Sistare wrote:
vhost requires us to stop the vm early:
qmp_migrate
stop vm
migration_call_notifiers MIG_EVENT_PRECOPY_CPR_SETUP
vhost_cpr_notifier
vhost_reset_device - must be after stop vm
- and before new qemu inits devices
cpr_state_save
unblocks new qemu which inits devices and calls vhost_set_owner
Thus config commands must be sent to the target during the guest pause interval :(
I can understand that it needs the VM stopped, but it can still happen after
cpr_save(), am I right (IOW, the fd won't change in the notifier)? I meant
the sequence below:
- src: cpr_save(), when running, NONE->SETUP_CPR, all fds synced
- [whatever happens..]
- src: finally decide to switchover, vm stop
- vhost notifier invoked. PS: it doesn't need to be named a SETUP_CPR
notifier here; it could be something else.
The problem is that the first step, cpr_save, causes the dest to finish
cpr_load_state and proceed to initialize devices in
qemu_create_late_backends -> net_init_clients.
This calls ioctl VHOST_SET_OWNER, which fails because the device is still
owned by src qemu.
src qemu releases ownership via VHOST_RESET_OWNER in the vhost notifier.
Thus the guest must be paused while config commands are sent to the target.
We could avoid that with any of:
* do not issue config commands
* precreate phase
* cpr-exec mode
* only pause if vhost is present (e.g. no pause for vfio).
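
To make the ordering constraint concrete, here is a toy model in Python. It is only a sketch: VhostDevice, run() and the step table are hypothetical names for illustration, not qemu or kernel code, and it assumes nothing more than the single-owner behavior described above (VHOST_SET_OWNER fails while src qemu still owns the device, and VHOST_RESET_OWNER releases it).

```python
class VhostDevice:
    """Toy model of the single-owner rule; not the real kernel interface."""

    def __init__(self):
        self.owner = None

    def set_owner(self, who):
        # Stands in for ioctl VHOST_SET_OWNER: refused while someone else owns it.
        if self.owner is not None:
            raise RuntimeError(f"set_owner by {who}: still owned by {self.owner}")
        self.owner = who

    def reset_owner(self, who):
        # Stands in for ioctl VHOST_RESET_OWNER: only the current owner releases.
        if self.owner != who:
            raise RuntimeError(f"reset_owner by {who}: owner is {self.owner}")
        self.owner = None


def run(order):
    """Run the src-side steps in the given order and report what happened."""
    dev = VhostDevice()
    dev.set_owner("src")  # src qemu owns the device before migration starts

    steps = {
        "stop_vm": lambda: None,
        # The vhost notifier is where src releases ownership.
        "vhost_notifier": lambda: dev.reset_owner("src"),
        # cpr_state_save unblocks dest, which inits devices and sets the owner.
        "cpr_state_save": lambda: dev.set_owner("dst"),
    }

    log = []
    for name in order:
        try:
            steps[name]()
            log.append((name, "ok"))
        except RuntimeError:
            log.append((name, "failed"))
            break
    return log, dev.owner


# Order used today: stop the vm, release ownership, then unblock dest.
current = run(["stop_vm", "vhost_notifier", "cpr_state_save"])

# Order suggested above: cpr_save first, stop and notify later.
proposed = run(["cpr_state_save", "stop_vm", "vhost_notifier"])

print(current)
print(proposed)
```

In this model the first order ends with dest owning the device, while the second fails at the first step because src has not yet released it. That is the conflict that forces the pause when vhost is present.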
- Steve