------- Comment From mjros...@us.ibm.com 2023-07-18 09:54 EDT------- Hi Frank -- Thanks a lot. Here's some information about the testing I intend to do (I will duplicate it to the QEMU feature); let me know if you have any questions or if you need more details. I don't see a QEMU package available for testing yet, so if need be I can use upstream QEMU to verify the kernel.
Testing will consist of the following (all on s390):

Hardware used: z14 or greater LPAR, PCI-attached devices (RoCE VFs, ISM devices, NVMe drive)

Setup: Both the kernel and QEMU features are needed for the feature to function (an upstream QEMU can be used to verify the kernel early), and the facility is only available on z14 or newer. When any of those pieces is missing, the interpretation facility will not be used. When both the kernel and QEMU features are included in their respective packages, and running in an LPAR on a z14 or newer machine, this feature is enabled automatically. Existing supported devices should behave as before with no changes required by an end-user (e.g. no changes to libvirt domain definitions) -- but will now make use of the interpretation facility. Additionally, ISM devices will now be eligible for vfio-pci passthrough (where before QEMU would exit on error when attempting to provide an ISM device for vfio-pci passthrough, preventing the guest from starting).

Testing will include the following scenarios, repeated for each of RoCE, ISM and NVMe:
1) Testing of basic device passthrough: create a VM with a vfio-pci device as part of the libvirt domain definition, passing through a RoCE VF, an ISM device, or an NVMe drive. Verify that the device is available in the guest and functioning.
2) Testing of device hotplug/unplug: create a VM with a vfio-pci device, use virsh detach-device to remove the device from the running guest, verify the device is removed from the guest, then use virsh attach-device to hotplug the device to the guest again, and verify the device functions in the guest.
3) Host power off testing: power off the device from the host, verify that the device is unplugged from the guest as part of the poweroff.
4) Guest power off testing: power off the device from within the guest, verify that the device is unusable in the guest, power the device back on within the guest and verify that the device is once again usable.
5) Guest reboot testing: create a VM with a vfio-pci device, verify the device is in working condition, reboot the guest, verify that the device is still usable after reboot.

Testing will include the following scenarios specifically for ISM devices:
1) Testing of SMC-D v1 fallback: Using 2 ISM devices on the same VCHID that share a PNETID, create 2 guests and pass one ISM device via vfio-pci to each guest. Establish TCP connectivity between the 2 guests using the libvirt default network, and then use smc_run (https://manpages.ubuntu.com/manpages/jammy/man8/smc_run.8.html) to run an iperf workload between the 2 guests (will include both short workloads and longer-running workloads). Verify via 'smcd stats' (https://manpages.ubuntu.com/manpages/jammy/man8/smcd.8.html) that SMC-D transfer was used between the guests instead of TCP.
2) Testing of SMC-D v2: Same as above, but using 2 ISM devices on the same VCHID that have no PNETID specified.

Testing will include the following scenarios specifically for RoCE devices:
1) Ping testing: Using 2 RoCE VFs that share a common network, create 2 guests and pass one RoCE device to each guest. Assign IP addresses within each guest to the associated network interface, perform a ping between the guests to verify connectivity.
2) Iperf testing: Similar to the above, but instead establish an iperf connection between the 2 guests and verify that the workload is successful / no errors. Will include both short workloads and longer-running workloads.

Testing will include the following scenario specifically for NVMe devices:
1) Fio testing: Using an NVMe drive passed to the guest via vfio-pci, run a series of fio tests against the device from within the guest, verifying that the workload is successful / no errors. Will include both short workloads and longer-running workloads.

Rough example invocations for the scenarios above are sketched below.
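For scenario 1 (basic passthrough), a minimal sketch of the hostdev element that would go into the libvirt domain definition (or into a standalone hostdev.xml for the hotplug scenario). This is only an assumption of what the setup will look like: the host PCI address and the zpci uid/fid values are placeholders, and the zpci sub-element is optional.

  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <!-- host address of the RoCE VF, ISM device or NVMe drive (placeholder) -->
      <address domain='0x0000' bus='0x00' slot='0x00' function='0x0'/>
    </source>
    <!-- optional explicit zPCI identifiers for the guest (placeholders) -->
    <address type='pci'>
      <zpci uid='0x0001' fid='0x00000000'/>
    </address>
  </hostdev>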
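For scenario 2 (hotplug/unplug), a rough virsh sequence; 'guest1' and 'hostdev.xml' (containing a hostdev element like the one above) are placeholder names:

  # remove the device from the running guest
  virsh detach-device guest1 hostdev.xml --live
  # inside the guest: lspci should no longer list the device
  # re-attach the device to the running guest
  virsh attach-device guest1 hostdev.xml --live
  # inside the guest: lspci should list the device again; exercise it to confirm it functions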
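For the ISM/SMC-D scenarios, a rough sketch of the workload and verification steps inside the guests; the IP address is a placeholder on the libvirt default network, and iperf3 is used here although any iperf variant would do:

  # guest A (server side)
  smc_run iperf3 -s
  # guest B (client side)
  smc_run iperf3 -c 192.168.122.10 -t 60
  # on either guest afterwards, confirm SMC-D was used instead of falling back to TCP
  smcd stats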
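For the RoCE scenarios, a rough sketch of the in-guest steps; the interface name 'eth1' and the 192.168.100.0/24 addresses are placeholders for the passed-through RoCE VF interfaces:

  # guest A
  ip link set eth1 up
  ip addr add 192.168.100.1/24 dev eth1
  iperf3 -s
  # guest B
  ip link set eth1 up
  ip addr add 192.168.100.2/24 dev eth1
  ping -c 5 192.168.100.1
  iperf3 -c 192.168.100.1 -t 60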
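For the NVMe scenario, a rough example of one of the fio runs inside the guest; /dev/nvme0n1 and the job parameters are placeholders, and the real series will vary block sizes, queue depths and run times. Note that this writes to the raw device, so the drive must not hold data that needs to be kept.

  fio --name=zpci-passthrough-check --filename=/dev/nvme0n1 \
      --ioengine=libaio --direct=1 --rw=randrw --bs=4k --iodepth=16 \
      --numjobs=4 --time_based --runtime=60 --group_reporting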
https://bugs.launchpad.net/bugs/1853306

Title:
  [22.04 FEAT] Enhanced Interpretation for PCI Functions on s390x - kernel part

Status in Ubuntu on IBM z Systems: Fix Released
Status in linux package in Ubuntu: Fix Released
Status in linux source package in Jammy: Fix Committed
Status in linux source package in Kinetic: Won't Fix
Status in linux source package in Lunar: Fix Released
Status in linux source package in Mantic: Fix Released

Bug description:
  [ Impact ]

  * Currently the PCI passthrough implementation for s390x is based on intercepting PCI I/O instructions, which leads to reduced I/O performance compared to the execution of PCI instructions directly in LPAR.

  * Hence users may face I/O bottlenecks when using PCI devices in passthrough mode based on the current implementation.

  * To avoid this and to improve performance, the interpretive execution of the PCI store and PCI load instructions is enabled.

  * A further improvement is achieved by enabling the Adapter-Event-Notification Interpretation (AENI).

  * Since LTS releases are the main focus for stable and long running KVM workloads, it is highly desired to get this backported to the jammy kernel (and because the next LTS is still some time away).

  [ Test Plan ]

  * Have an Ubuntu Server 22.04 installation on LPAR that is able to access (ideally multiple) PCI devices, like RoCE Express (network) or NVMe disks.

  * Set up KVM and pass through (ideally multiple of) these PCI devices (that are otherwise unused on the KVM host).

  * Generate I/O load on these passed-through PCI devices, for example with stress-ng, using class network and/or device and/or io stressors (an example invocation is sketched at the end of this description).

  * This PR also introduces a new kernel config option 'VFIO_PCI_ZDEV_KVM' that enables support for the s390x-specific extensions and enhancements to KVM passthrough, such as interpretive execution of zPCI instructions; this option is enabled with this PR.

  * The qemu autopkgtest (also needed due to LP#1853307) will be a good fit to identify any potential regressions, also in the kvm kernel area.

  * zPCI passthrough related tests will be done by IBM.

  [ Where problems could occur ]

  * The modifications do not change the way users or APIs make use of PCI passthrough; only the internal implementation was modified.

  * The vast majority of the code changes / additional code is s390x-specific, under arch/s390 and drivers/s390.

  * However, there is also common code touched:

  * 'kvm: use kvfree() in kvm_arch_free_vm()' touches arch/arm64/include/asm/kvm_host.h, arch/arm64/kvm/arm.c, arch/x86/include/asm/kvm_host.h, arch/x86/kvm/x86.c and include/linux/kvm_host.h, and switches kvm_arch_free_vm() from kfree() to kvfree(), allowing use of the common variant; it is upstream since v5.16 and with that well established.

  * And 'vfio-pci/zdev: add open/close device hooks' touches drivers/vfio/pci/vfio_pci_core.c, drivers/vfio/pci/vfio_pci_zdev.c and include/linux/vfio_pci_core.h, adding code to introduce device hooks. It's upstream since kernel 6.0.
  * 'KVM: s390: pci: provide routines for en-/disabling interrupt forwarding' expands a single #if statement in include/linux/sched/user.h.

  * 'KVM: s390: add KVM_S390_ZPCI_OP to manage guest zPCI devices' adds the s390x-specific KVM_S390_ZPCI_OP and its definition to include/uapi/linux/kvm.h.

  * And 'vfio-pci/zdev: different maxstbl for interpreted devices' and 'vfio-pci/zdev: add function handle to clp base capability' expand s390x-specific (aka z-specific aka zdev) device structs in include/uapi/linux/vfio_zdev.h.

  * This shows that the vast majority of modifications are s390x-specific, even in most of the common code files.

  * The remaining modifications in the (generally) common code files are related to the newly introduced kernel option 'CONFIG_VFIO_PCI_ZDEV_KVM' and documentation.

  * The s390x changes are more significant, and could harm not only passthrough itself for zPCI devices, but also KVM virtualization in general.

  * In addition to these kernel changes, qemu modifications are needed as well (addressed at LP#1853307); this modified kernel must be tested in combination with the updated qemu package.
    - The qemu autopkgtest will be a good fit to identify any regressions, also in the kernel.
    - In addition, some passthrough related tests will be done by IBM.

  __________

  The PCI passthrough implementation is based on intercepting PCI I/O instructions, which leads to reduced I/O performance compared to execution of PCI instructions in LPAR. For improved performance, the interpretive execution of the PCI store and PCI load instructions is enabled. Further improvement is achieved by enabling the Adapter-Event-Notification Interpretation (AENI).
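As a rough illustration of the 'Generate I/O load' step in the [ Test Plan ] above, run inside the guest against the passed-through devices; the stressor classes, durations and the config file path are assumptions, not part of the actual plan:

  # confirm the new option is enabled in the running kernel's config
  grep CONFIG_VFIO_PCI_ZDEV_KVM /boot/config-$(uname -r)
  # cycle through the network and io class stressors
  stress-ng --sequential 0 --class network --timeout 60s --metrics-brief
  stress-ng --sequential 0 --class io --timeout 60s --metrics-brief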