On Fri, Dec 8, 2023 at 2:50 AM Si-Wei Liu <si-wei....@oracle.com> wrote:
>
> This patch series contains several enhancements to SVQ live migration downtime
> for vDPA-net hardware devices, specifically on mlx5_vdpa. Currently it is based
> off of Eugenio's RFC v2 .load_setup series [1] to utilize the shared facility
> and reduce friction in merging or duplicating code if at all possible.
>
> The patches are stacked in a particular order, as each optimization on top
> depends on the ones below it. Here's a breakdown of what each part does:
>
> Patch # | Feature / optimization
> ---------V-------------------------------------------------------------------
> 35 - 40 | trace events
> 34      | migrate_cancel bug fix
> 21 - 33 | (Un)map batching at stop-n-copy to further optimize LM downtime
> 11 - 20 | persistent IOTLB [3] to improve LM downtime
> 02 - 10 | SVQ descriptor ASID [2] to optimize SVQ switching
> 01      | dependent linux headers
>         V
>
> Let's first define the 2 sources of downtime that this work is concerned with:
>
> * SVQ switching downtime (Downtime #1): downtime at the start of migration.
>   Time spent on teardown and setup for SVQ mode switching; this downtime
>   is regarded as the maximum for an individual vdpa-net device. No memory
>   transfer is involved during SVQ switching.
>
> * LM downtime (Downtime #2): aggregated downtime for all vdpa-net devices on
>   resource teardown and setup in the last stop-and-copy phase on the source
>   host.
>
> With each part of the optimizations applied bottom up, the effective outcome
> in terms of downtime (in seconds) can be observed in this table:
>
>                     | Downtime #1       | Downtime #2
> --------------------+-------------------+-------------------
> Baseline QEMU       | 20s ~ 30s         | 20s
>                     |                   |
> Iterative map       |                   |
> at destination [1]  | 5s                | 20s
>                     |                   |
> SVQ descriptor      |                   |
> ASID [2]            | 2s                | 5s
>                     |                   |
> persistent IOTLB    |                   |
> [3]                 | 2s                | 2s
>                     |                   |
> (Un)map batching    |                   |
> at stop-n-copy      | 1.7s              | 1.5s
> before switchover   |                   |
>
> (VM config: 128GB mem, 2 mlx5_vdpa devices, each w/ 4 data vqs)
This looks promising! But the series looks a little bit huge; can we split
it into 2 or 3 series? That would help to speed up reviewing and merging.

Thanks

> Please find the details regarding each enhancement in the commit logs.
>
> Thanks,
> -Siwei
>
>
> [1] [RFC PATCH v2 00/10] Map memory at destination .load_setup in vDPA-net
>     migration
>     https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg05711.html
> [2] VHOST_BACKEND_F_DESC_ASID
>     https://lore.kernel.org/virtualization/20231018171456.1624030-2-dtatu...@nvidia.com/
> [3] VHOST_BACKEND_F_IOTLB_PERSIST
>     https://lore.kernel.org/virtualization/1698304480-18463-1-git-send-email-si-wei....@oracle.com/
>
> ---
>
> Si-Wei Liu (40):
>   linux-headers: add vhost_types.h and vhost.h
>   vdpa: add vhost_vdpa_get_vring_desc_group
>   vdpa: probe descriptor group index for data vqs
>   vdpa: piggyback desc_group index when probing isolated cvq
>   vdpa: populate desc_group from net_vhost_vdpa_init
>   vhost: make svq work with gpa without iova translation
>   vdpa: move around vhost_vdpa_set_address_space_id
>   vdpa: add back vhost_vdpa_net_first_nc_vdpa
>   vdpa: no repeat setting shadow_data
>   vdpa: assign svq descriptors a separate ASID when possible
>   vdpa: factor out vhost_vdpa_last_dev
>   vdpa: check map_thread_enabled before join maps thread
>   vdpa: ref counting VhostVDPAShared
>   vdpa: convert iova_tree to ref count based
>   vdpa: add svq_switching and flush_map to header
>   vdpa: indicate SVQ switching via flag
>   vdpa: judge if map can be kept across reset
>   vdpa: unregister listener on last dev cleanup
>   vdpa: should avoid map flushing with persistent iotlb
>   vdpa: avoid mapping flush across reset
>   vdpa: vhost_vdpa_dma_batch_end_once rename
>   vdpa: factor out vhost_vdpa_map_batch_begin
>   vdpa: vhost_vdpa_dma_batch_begin_once rename
>   vdpa: factor out vhost_vdpa_dma_batch_end
>   vdpa: add asid to dma_batch_once API
>   vdpa: return int for dma_batch_once API
>   vdpa: add asid to all dma_batch call sites
>   vdpa: support iotlb_batch_asid
>   vdpa: expose API vhost_vdpa_dma_batch_once
>   vdpa: batch map/unmap op per svq pair basis
>   vdpa: batch map and unmap around cvq svq start/stop
>   vdpa: factor out vhost_vdpa_net_get_nc_vdpa
>   vdpa: batch multiple dma_unmap to a single call for vm stop
>   vdpa: fix network breakage after cancelling migration
>   vdpa: add vhost_vdpa_set_address_space_id trace
>   vdpa: add vhost_vdpa_get_vring_base trace for svq mode
>   vdpa: add vhost_vdpa_set_dev_vring_base trace for svq mode
>   vdpa: add trace events for eval_flush
>   vdpa: add trace events for vhost_vdpa_net_load_cmd
>   vdpa: add trace event for vhost_vdpa_net_load_mq
>
>  hw/virtio/trace-events                       |   9 +-
>  hw/virtio/vhost-shadow-virtqueue.c           |  35 ++-
>  hw/virtio/vhost-vdpa.c                       | 156 +++++++---
>  include/hw/virtio/vhost-vdpa.h               |  16 +
>  include/standard-headers/linux/vhost_types.h |  13 +
>  linux-headers/linux/vhost.h                  |   9 +
>  net/trace-events                             |   8 +
>  net/vhost-vdpa.c                             | 434 ++++++++++++++++++++++-----
>  8 files changed, 558 insertions(+), 122 deletions(-)
>
> --
> 1.8.3.1
>