Re: [RFC PATCH 00/18] Map memory at destination .load_setup in vDPA-net migration

Si-Wei Liu Thu, 02 Nov 2023 03:13:39 -0700



On 10/19/2023 7:34 AM, Eugenio Pérez wrote:

Current memory operations like pinning may take a lot of time at the

destination.  Currently they are done after the source of the migration is

stopped, and before the workload is resumed at the destination.  This is a

period where neigher traffic can flow, nor the VM workload can continue

(downtime).



We can do better as we know the memory layout of the guest RAM at the

destination from the moment the migration starts.  Moving that operation allows

QEMU to communicate the kernel the maps while the workload is still running in

the source, so Linux can start mapping them.  Ideally, all IOMMU is configured,

but if the vDPA parent driver uses on-chip IOMMU and .set_map we're still

saving all the pinning time.

I get what you want to say, though not sure how pinning is relevant toon-chip IOMMU and .set_map here, essentially pinning is required for allparent vdpa drivers that perform DMA hence don't want VM pages to movearound.




Note that further devices setup at the end of the migration may alter the guest

memory layout. But same as the previous point, many operations are still done

incrementally, like memory pinning, so we're saving time anyway.



The first bunch of patches just reorganizes the code, so memory related

operation parameters are shared between all vhost_vdpa devices.  This is

because the destination does not know what vhost_vdpa struct will have the

registered listener member, so it is easier to place them in a shared struct

rather to keep them in vhost_vdpa struct.  Future version may squash or omit

these patches.

It looks this VhostVDPAShared facility (patch 1-13) is also what I needin my SVQ descriptor group series [*], for which I've built similarconstruct there. If possible please try to merge this in ASAP. I'llrework my series on top of that.

[*]https://github.com/siwliu-kernel/qemu/commit/813518354af5ee8a6e867b2bf7dff3d6004fbcd5




Only tested with vdpa_sim. I'm sending this before full benchmark, as some work

like [1] can be based on it, and Si-Wei agreed on benchmark this series with

his experience.

Haven't done the full benchmark compared to pre-map at destination yet,though an observation is that the destination QEMU seems very easy toget stuck for very long time while in mid of pinning pages. During thisperiod, any client doing read-only QMP query or executing HMP infocommand got frozen indefinitely (subject to how large size the memory isbeing pinned). Is it possible to unblock those QMP request or HMPcommand from being executed (at least the read-only ones) while inmigration? Yield from the load_setup corourtine and spawn another thread?

Having said, not sure if .load_setup is a good fit for what we want todo. Searching all current users of .load_setup, either the job can bedone instantly or the task is time bound without trapping into kernelfor too long. Maybe pinning is too special use case here...


-Siwei




Future directions on top of this series may include:

* Iterative migration of virtio-net devices, as it may reduce downtime per [1].

   vhost-vdpa net can apply the configuration through CVQ in the destination

   while the source is still migrating.

* Move more things ahead of migration time, like DRIVER_OK.

* Check that the devices of the destination are valid, and cancel the migration

   in case it is not.



[1] 
https://lore.kernel.org/qemu-devel/6c8ebb97-d546-3f1c-4cdd-54e23a566...@nvidia.com/T/



Eugenio Pérez (18):

   vdpa: add VhostVDPAShared

   vdpa: move iova tree to the shared struct

   vdpa: move iova_range to vhost_vdpa_shared

   vdpa: move shadow_data to vhost_vdpa_shared

   vdpa: use vdpa shared for tracing

   vdpa: move file descriptor to vhost_vdpa_shared

   vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared

   vdpa: move backend_cap to vhost_vdpa_shared

   vdpa: remove msg type of vhost_vdpa

   vdpa: move iommu_list to vhost_vdpa_shared

   vdpa: use VhostVDPAShared in vdpa_dma_map and unmap

   vdpa: use dev_shared in vdpa_iommu

   vdpa: move memory listener to vhost_vdpa_shared

   vdpa: do not set virtio status bits if unneeded

   vdpa: add vhost_vdpa_load_setup

   vdpa: add vhost_vdpa_net_load_setup NetClient callback

   vdpa: use shadow_data instead of first device v->shadow_vqs_enabled

   virtio_net: register incremental migration handlers



  include/hw/virtio/vhost-vdpa.h |  43 +++++---

  include/net/net.h              |   4 +

  hw/net/virtio-net.c            |  23 +++++

  hw/virtio/vdpa-dev.c           |   7 +-

  hw/virtio/vhost-vdpa.c         | 183 ++++++++++++++++++---------------

  net/vhost-vdpa.c               | 127 ++++++++++++-----------

  hw/virtio/trace-events         |  14 +--

  7 files changed, 239 insertions(+), 162 deletions(-)

Re: [RFC PATCH 00/18] Map memory at destination .load_setup in vDPA-net migration

Reply via email to