Hi,

This series of patches is an attempt to add support for the following
sections of NVMe specification revision 1.4:

8.5 Virtualization Enhancements (Optional)
    8.5.1 VQ Resource Definition
    8.5.2 VI Resource Definition
    8.5.3 Secondary Controller States and Resource Configuration
    8.5.4 Single Root I/O Virtualization and Sharing (SR-IOV)

The NVMe controller's Single Root I/O Virtualization and Sharing
implementation is based on patches introducing SR-IOV support for PCI
Express proposed by Knut Omang:
https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05155.html

However, based on what I was able to find in the list archives, Knut's
patches have not yet been pulled into QEMU because no working example
device existed up to this point:
https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg02722.html

In terms of design, the Physical Function controller and the Virtual
Function controllers are almost independent, with two exceptions:
the PF handles flexible resource allocation for all its children (VFs
have read-only access to this data), and the PF handles reset (it
explicitly calls reset on its VFs). Since MMIO access is serialized, no
extra precautions are required to handle concurrent resets, and
secondary controller state access doesn't need to be atomic.

A controller with full SR-IOV support must be capable of handling the
Namespace Management command. As that functionality is already under
review in a separate series, this series does not duplicate the effort.
The NS management patches are not required to test the SR-IOV support,
though.

We tested the patches on Ubuntu 20.04.3 LTS with kernel 5.4.0. We have
hit various issues with NVMe CLI (the list and virt-mgmt commands)
between releases from version 1.09 to master, so we chose this
known-good NVMe CLI commit for testing: a50a0c1.

The implementation is not 100% finished and certainly not bug free. We
are already aware of some issues, e.g. interaction with namespaces
related to AER, or unexpected (?) kernel behavior in more complex
reset scenarios.
However, our SR-IOV implementation is already able to support typical
SR-IOV use cases, so we believe the patches are ready to share with the
community.

We hope you find some time to review the work we did and share your
thoughts.

Kind regards,
Lukasz

Knut Omang (3):
  pcie: Set default and supported MaxReadReq to 512
  pcie: Add support for Single Root I/O Virtualization (SR/IOV)
  pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt

Lukasz Maniak (5):
  pcie: Add callback preceding SR-IOV VFs update
  hw/nvme: Add support for SR-IOV
  hw/nvme: Add support for Primary Controller Capabilities
  hw/nvme: Add support for Secondary Controller List
  docs: Add documentation for SR-IOV and Virtualization Enhancements

Łukasz Gieryk (7):
  pcie: Add 1.2 version token for the Power Management Capability
  hw/nvme: Implement the Function Level Reset
  hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime
  hw/nvme: Calculate BAR attributes in a function
  hw/nvme: Initialize capability structures for primary/secondary
    controllers
  pcie: Add helpers to the SR/IOV API
  hw/nvme: Add support for the Virtualization Management command

 docs/pcie_sriov.txt          | 115 +++++++
 docs/system/devices/nvme.rst |  27 ++
 hw/nvme/ctrl.c               | 589 ++++++++++++++++++++++++++++++++---
 hw/nvme/ns.c                 |   2 +-
 hw/nvme/nvme.h               |  47 ++-
 hw/nvme/subsys.c             |  74 ++++-
 hw/nvme/trace-events         |   6 +
 hw/pci/meson.build           |   1 +
 hw/pci/pci.c                 |  97 ++++--
 hw/pci/pcie.c                |  10 +-
 hw/pci/pcie_sriov.c          | 313 +++++++++++++++++++
 hw/pci/trace-events          |   5 +
 include/block/nvme.h         |  65 ++++
 include/hw/pci/pci.h         |  12 +-
 include/hw/pci/pci_ids.h     |   1 +
 include/hw/pci/pci_regs.h    |   1 +
 include/hw/pci/pcie.h        |   6 +
 include/hw/pci/pcie_sriov.h  |  81 +++++
 include/qemu/typedefs.h      |   2 +
 19 files changed, 1369 insertions(+), 85 deletions(-)
 create mode 100644 docs/pcie_sriov.txt
 create mode 100644 hw/pci/pcie_sriov.c
 create mode 100644 include/hw/pci/pcie_sriov.h

--
2.25.1