QE tested this series with regression tests; there are no new regression issues.
Tested-by: Lei Yang <leiy...@redhat.com>

On Sat, Dec 16, 2023 at 1:28 AM Eugenio Pérez <epere...@redhat.com> wrote:
>
> Current memory operations like pinning may take a lot of time at the
> destination. Currently they are done after the source of the migration is
> stopped, and before the workload is resumed at the destination. This is a
> period where neither traffic can flow, nor the VM workload can continue
> (downtime).
>
> We can do better, as we know the memory layout of the guest RAM at the
> destination from the moment the migration starts. Moving that operation
> earlier allows QEMU to communicate the maps to the kernel while the workload
> is still running in the source, so Linux can start mapping them.
>
> Also, the copy of the guest memory may finish before the destination QEMU
> maps all the memory. In this case, the rest of the memory will be mapped at
> the same time as before applying this series, when the device is starting.
> So we're only improving with this series.
>
> If the destination has the switchover_ack capability enabled, the destination
> holds the migration until all the memory is mapped.
>
> This needs to be applied on top of [1]. That series performs some code
> reorganization that allows mapping the guest memory without knowing the queue
> layout the guest configures on the device.
>
> This series reduced the downtime in the stop-and-copy phase of the live
> migration from 20s~30s to 5s, with a 128G mem guest and two mlx5_vdpa
> devices, per [2].
>
> Future directions on top of this series may include:
> * Iterative migration of virtio-net devices, as it may reduce downtime per
>   [3]. vhost-vdpa net can apply the configuration through CVQ in the
>   destination while the source is still migrating.
> * Move more things ahead of migration time, like DRIVER_OK.
> * Check that the devices of the destination are valid, and cancel the
>   migration in case they are not.
>
> v1 from RFC v2:
> * Hold on migration if memory has not been mapped in full with switchover_ack.
> * Revert map if the device is not started.
>
> RFC v2:
> * Delegate map to another thread so it does not block QMP.
> * Fix not allocating iova_tree if x-svq=on at the destination.
> * Rebased on latest master.
> * More cleanups of current code, that might be split from this series too.
>
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg01986.html
> [2] https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg00909.html
> [3] https://lore.kernel.org/qemu-devel/6c8ebb97-d546-3f1c-4cdd-54e23a566...@nvidia.com/T/
>
> Eugenio Pérez (12):
>   vdpa: do not set virtio status bits if unneeded
>   vdpa: make batch_begin_once early return
>   vdpa: merge _begin_batch into _batch_begin_once
>   vdpa: extract out _dma_end_batch from _listener_commit
>   vdpa: factor out stop path of vhost_vdpa_dev_start
>   vdpa: check for iova tree initialized at net_client_start
>   vdpa: set backend capabilities at vhost_vdpa_init
>   vdpa: add vhost_vdpa_load_setup
>   vdpa: approve switchover after memory map in the migration destination
>   vdpa: add vhost_vdpa_net_load_setup NetClient callback
>   vdpa: add vhost_vdpa_net_switchover_ack_needed
>   virtio_net: register incremental migration handlers
>
>  include/hw/virtio/vhost-vdpa.h |  32 ++++
>  include/net/net.h              |   8 +
>  hw/net/virtio-net.c            |  48 ++++++
>  hw/virtio/vhost-vdpa.c         | 274 +++++++++++++++++++++++++++------
>  net/vhost-vdpa.c               |  43 +++++-
>  5 files changed, 357 insertions(+), 48 deletions(-)
>
> --
> 2.39.3
>
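For anyone reproducing the test setup: switchover-ack is a migration
capability, so it has to be enabled on both the source and the destination
before starting the migration. A minimal QMP sketch (capability name as
exposed by QEMU; the exact monitor invocation depends on your setup):

  { "execute": "migrate-set-capabilities",
    "arguments": { "capabilities": [
      { "capability": "switchover-ack", "state": true } ] } }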