Patches submitted to kernel-team mailing list: https://lists.ubuntu.com/archives/kernel-team/2024-December/155853.html.
SRU Justification

[Impact]

The patch "vfio/pci: Use unmap_mapping_range()" rewrote the way VFIO tracks mapped regions to use the "vmf_insert_pfn" function instead of tracking them itself and using "io_remap_pfn_range". The implementation using "vmf_insert_pfn" is significantly slower.

To mitigate this slowdown, "vfio/pci: Insert full vma on mmap'd MMIO fault" was introduced to prefault the entirety of areas mapped by vfio_pci, resulting in soft lockup warnings on the host for large BAR region devices.

Reverting this prefaulting behavior does not fully resolve the slowness, as a VM still experiences extremely slow accesses to the passthrough devices as VMAs get faulted in, causing soft lockup warnings in the guest during boot. Thus, "vfio/pci: Use unmap_mapping_range()" must also be reverted to restore performance to that of versions prior to 6.8.0-48-generic.

[Fix]

Both of these performance issues are resolved upstream by patchset [1], but this would be a complex backport to 6.8 and 6.11, with significant changes to core parts of the kernel.

Reverting the following commits resolves the issue, with a much reduced potential for regression:

- "mm: use rwsem assertion macros for mmap_lock" (revert needed in Oracular, not present in Noble)
- "vfio/pci: Insert full vma on mmap'd MMIO fault"
- "vfio/pci: Use unmap_mapping_range()"

[Test Plan]

Tested on a DGX H100 system, verified to reduce VM start time with 8 passthrough H100 GPUs from 45 minutes back down to 5 minutes and eliminate the soft lockup warnings.
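
One way to check for the soft lockup warnings mentioned above is to scan the kernel log on the host or in the guest. The following is only a minimal sketch and is not part of the original test plan; the helper name count_soft_lockups is hypothetical, and it assumes the warnings contain the usual "soft lockup" text printed by the kernel watchdog.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): count kernel log
# lines that contain a soft lockup warning. Reads log text on stdin.
count_soft_lockups() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are no matches, so "|| true" keeps the exit status 0.
    grep -c 'soft lockup' || true
}

# Example usage (reading the kernel log may require root):
#   sudo dmesg | count_soft_lockups
```

A count of 0 after VM start on a kernel with the reverts applied would be consistent with the result described above.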
Reproduced using a libvirt VM, created with:

$ sudo virt-install --connect qemu:///system -v --name gpu-pt-test \
    --memory 16384 --vcpus 16 --cpu host --cdrom \
    /ubuntu-24.04.1-live-server-amd64.iso --os-variant ubuntu24.04 \
    --disk size=512 -w bridge=virbr0 --graphics none \
    --console pty,target.type=virtio \
    --hostdev pci_0000_1b_00_0 --hostdev pci_0000_43_00_0 \
    --hostdev pci_0000_52_00_0 --hostdev pci_0000_61_00_0 \
    --hostdev pci_0000_9d_00_0 --hostdev pci_0000_d1_00_0 \
    --hostdev pci_0000_df_00_0 --hostdev pci_0000_c3_00_0

[Where problems could occur]

The reverts here primarily affect the vfio_pci driver. However, in Oracular "mm: use rwsem assertion macros for mmap_lock" is also reverted. This could result in misbehavior of the vfio_pci driver. In Oracular, it could also result in mmap locking bugs going undetected unless testing is done with lockdep enabled.

[1] https://patchwork.kernel.org/project/linux-mm/list/?series=883517

** Changed in: linux-nvidia (Ubuntu Noble)
       Status: In Progress => Fix Committed

https://bugs.launchpad.net/bugs/2089306

Title:
  vfio_pci soft lockup on VM start while using PCIe passthrough