** Changed in: linux (Ubuntu Jammy)
Assignee: (unassigned) => Keifer Snedeker (ks0)
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2097389
Title:
VM boots slowly with large-BAR GPU Passthrough due to pci/probe.c
redundancy
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Jammy:
Fix Committed
Status in linux source package in Noble:
Fix Released
Status in linux source package in Oracular:
Fix Released
Bug description:
SRU Justification:
[ Impact ]
Without this patch, VM guests that have large-BAR GPUs passed through to
them take about 2x as long to initialize all device BARs.
[ Test Plan ]
I verified that this patch applies cleanly to the Noble kernel
and resolves the bug on DGX H100 and DGX A100. I observed no regressions.
This can be verified on any machine with a sufficiently large BAR and the
capability to pass through to a VM using vfio.
To verify no regressions, I applied this patch to the guest kernel, then
rebooted and confirmed that:
1. The measured PCI initialization time on boot was ~50% of that measured
with the unmodified kernel.
2. The relevant parts of the /proc/iomem mappings, the PCI init section of
the dmesg output, and the lspci -vv output were unchanged between the
unmodified kernel and the patched kernel.
3. The NVIDIA driver still loaded successfully and was reported by
nvidia-smi after the patch was applied.
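The comparison in step 2 can be sketched as a small helper. This is only an
illustration, not the script used for the verification above: the function
name snapshot_diff and the idea of saving each command's output as text
before and after patching are assumptions.

```python
import difflib


def snapshot_diff(before, after, label):
    """Return unified-diff lines between two captured text snapshots.

    `before` and `after` are the saved outputs (for example of
    /proc/iomem, the PCI section of dmesg, or lspci -vv) from the
    unmodified and patched kernels. An empty list means no change.
    """
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=label + " (unmodified)",
            tofile=label + " (patched)",
            lineterm="",
        )
    )
```

An empty result for each of the three captures corresponds to the
"remained unchanged" condition. dmesg timestamps differ between boots, so
they would need to be stripped before comparing.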
[ Fix ]
Roughly half of the time-consuming device configuration operations invoked
during the PCI probe function can be eliminated by rearranging the memory
and I/O disable/enable calls so that they occur once per device rather than
once per BAR. This is what the upstream patch does, and it reliably
eliminates roughly half of the excess initialization time during VM boot.
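The rearrangement can be illustrated with a toy model. This is not the
kernel code and not the upstream patch; FakeDevice, set_decode and size_bar
are hypothetical names, and the model only counts how often memory and I/O
decoding is toggled for a function with the six BARs of a standard PCI
header.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDevice:
    """Toy stand-in for a PCI function; it only counts decode toggles."""
    num_bars: int = 6
    decode_toggles: int = 0
    sized: list = field(default_factory=list)

    def set_decode(self, enabled):
        # Stands in for disabling or re-enabling memory and I/O decoding.
        self.decode_toggles += 1

    def size_bar(self, index):
        # Stands in for sizing one BAR.
        self.sized.append(index)


def probe_per_bar(dev):
    """Old ordering: disable and re-enable decoding around every BAR."""
    for i in range(dev.num_bars):
        dev.set_decode(False)
        dev.size_bar(i)
        dev.set_decode(True)


def probe_per_device(dev):
    """New ordering: disable once, size every BAR, re-enable once."""
    dev.set_decode(False)
    for i in range(dev.num_bars):
        dev.size_bar(i)
    dev.set_decode(True)
```

In this model the per-BAR ordering toggles decoding 12 times and the
per-device ordering toggles it twice, while the same BARs are sized in the
same order. The model says nothing about how expensive each toggle is in a
guest; the ~50% figure above comes from the measurements, not from this
count.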
[ Where problems could occur ]
I do not expect any regressions. The only callers of ABIs changed by
this patch are also adjusted within this patch, and the functional
change only removes entirely redundant calls to disable/enable PCI
memory/IO.
[ Additional Context ]
Upstream patch:
https://lore.kernel.org/all/[email protected]/
Upstream bug report:
https://lore.kernel.org/all/cahta-uyp07fgm6t1ozqkqadsa5jrzo0reneyzgqzub4mdrr...@mail.gmail.com/
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2097389/+subscriptions