Introduction
------------

This series is based on the RFC series submitted by Yui Washizu[1].
See also [2] for the context.

This series enables SR-IOV emulation for virtio-net. It is useful to
test SR-IOV support on the guest, or to expose several vDPA devices in a
VM. vDPA devices can also provide an L2 switching feature for
offloading, though allowing the guest to configure such a feature is out
of scope.

The PF side code resides in virtio-pci. The VF side code resides in the
PCI common infrastructure, but it is restricted to work only for
virtio-net-pci because of a lack of validation.

User Interface
--------------

A user can configure an SR-IOV capable virtio-net device by adding
virtio-net-pci functions to a bus. Below is a command line example:

  -netdev user,id=n -netdev user,id=o
  -netdev user,id=p -netdev user,id=q
  -device pcie-root-port,id=b
  -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
  -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
  -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
  -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f

The VFs specify the paired PF with the "sriov-pf" property. The PF must
be added after all VFs. It is the user's responsibility to ensure that
VFs have function numbers larger than that of the PF, and that the
function numbers have a consistent stride.

Keeping VF instances
--------------------

A problem with SR-IOV emulation is that it needs to hotplug the VFs as
the guest requests. Previously, this behavior was implemented by
realizing and unrealizing VFs at runtime. However, this strategy does
not work well for the proposed virtio-net emulation; in this proposal,
device options passed in the command line must be maintained as VFs
are hotplugged, but they are consumed when the machine starts and are
not available after that, which makes realizing VFs at runtime
impossible.

As an alternative strategy to runtime realization/unrealization, this
series proposes to reuse the code to power down PCI Express devices.
When a PCI Express device is powered down, it will be hidden from the
guest but will be kept realized.
This effectively implements the behavior we need for the SR-IOV
emulation.

Summary
-------

Patches [1, 5] refactor the PCI infrastructure code.
Patches [6, 10] add user-created SR-IOV VF infrastructure.
Patch 11 makes virtio-pci work as SR-IOV PF for user-created VFs.
Patch 12 allows the user to create SR-IOV VFs with virtio-net-pci.

[1] https://patchew.org/QEMU/1689731808-3009-1-git-send-email-yui.wash...@gmail.com/
[2] https://lore.kernel.org/all/5d46f455-f530-4e5e-9ae7-13a2297d4...@daynix.com/

Co-developed-by: Yui Washizu <yui.wash...@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com>
---
Changes in v2:
- Changed to keep VF instances.
- Link to v1: https://lore.kernel.org/r/20231202-sriov-v1-0-32b3570f7...@daynix.com

---
Akihiko Odaki (12):
      hw/pci: Initialize PCI multifunction after realization
      hw/pci: Determine if rombar is explicitly enabled
      hw/pci: Do not add ROM BAR for SR-IOV VF
      vfio: Avoid inspecting option QDict for rombar
      hw/qdev: Remove opts member
      pcie_sriov: Reuse SR-IOV VF device instances
      pcie_sriov: Release VFs failed to realize
      pcie_sriov: Ensure PF and VF are mutually exclusive
      pcie_sriov: Check PCI Express for SR-IOV PF
      pcie_sriov: Allow user to create SR-IOV device
      virtio-pci: Implement SR-IOV PF
      virtio-net: Implement SR-IOV VF

 docs/pcie_sriov.txt         |   8 +-
 include/hw/pci/pci.h        |   2 +-
 include/hw/pci/pci_device.h |  13 +-
 include/hw/pci/pcie_sriov.h |  25 ++-
 include/hw/qdev-core.h      |   4 -
 hw/core/qdev.c              |   1 -
 hw/net/igb.c                |   3 +-
 hw/nvme/ctrl.c              |   3 +-
 hw/pci/pci.c                |  98 +++++++-----
 hw/pci/pci_host.c           |   4 +-
 hw/pci/pcie.c               |   4 +-
 hw/pci/pcie_sriov.c         | 360 +++++++++++++++++++++++++++++++++-----------
 hw/vfio/pci.c               |   3 +-
 hw/virtio/virtio-net-pci.c  |   1 +
 hw/virtio/virtio-pci.c      |   7 +
 system/qdev-monitor.c       |  12 +-
 16 files changed, 395 insertions(+), 153 deletions(-)
---
base-commit: 4705fc0c8511d073bee4751c3c974aab2b10a970
change-id: 20231202-sriov-9402fb262be8

Best regards,
--
Akihiko Odaki <akihiko.od...@daynix.com>