Hi Dann, thanks a lot for the logs and the great bug report here.
The thread with the edk2 folks is really informative!

I've been doing some research on this topic and will share it here in order
to document it.
The first point is about "pci=nocrs" and "pci=realloc". Setting "pci=nocrs"
tells the kernel to disregard the ACPI resource information (the _CRS, or
Current Resource Settings, object). In this mode, all memory is available for
PCI host bridge allocations except RAM and other detected reservations, so it
bypasses the limitations of the OVMF firmware. This mode was the default 10
years ago, but it was changed due to some incompatibilities, for example on
systems with more than one PCI host bridge. The PCI subsystem maintainer
decided then to "trust" the FW resource mapping more, and kept the old
behavior available as a kernel fallback through the option "nocrs". This is
well explained in [0] and [1].
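To check which host bridge windows the kernel ended up with (with or without
"pci=nocrs"), one can look at the top-level "PCI Bus" entries in /proc/iomem.
Below is a minimal sketch of that, assuming the usual "start-end : name"
layout of the file; the helper name and the sample text are made up for
illustration, and the real file must be read as root to show the addresses.

```python
import re

# Matches top-level /proc/iomem lines like "c0000000-febfffff : PCI Bus 0000:00".
IOMEM_LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+) : (PCI Bus \S+)\s*$")

# Hypothetical sample, not taken from the system in this bug.
SAMPLE_IOMEM = """\
00001000-0009fbff : System RAM
c0000000-febfffff : PCI Bus 0000:00
  fe000000-fe7fffff : 0000:00:02.0
800000000-fffffffff : PCI Bus 0000:00
"""


def host_bridge_windows(iomem_text):
    """Return (name, start, end, size) for each top-level 'PCI Bus' entry."""
    windows = []
    for line in iomem_text.splitlines():
        if line.startswith(" "):
            # Indented lines are devices nested inside a window; skip them.
            continue
        match = IOMEM_LINE.match(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            windows.append((match.group(3), start, end, end - start + 1))
    return windows


if __name__ == "__main__":
    for name, start, end, size in host_bridge_windows(SAMPLE_IOMEM):
        print(f"{name}: {start:#x}-{end:#x} ({size / 2**30:.2f} GiB)")
```

On a real guest, pass the contents of /proc/iomem instead of the sample text.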

The option "pci=realloc" is somewhat orthogonal to it; basically, it
allows the kernel to perform memory (re-)assignments for PCI endpoints
(aka devices) within the memory space of their PCI host bridge. An
analogy would be: the PCI host bridge resource is a pile of memory from
which devices take some for their BARs. With "pci=realloc", we allow the
kernel to retry this memory mapping for PCI devices a few times, until it
works (or eventually fails). It's natural to use "pci=realloc", and this
option is somewhat automatic due to the kernel build-time configuration
PCI_REALLOC_ENABLE_AUTO, which is enabled by default in Ubuntu kernels. In
summary, "pci=realloc" governs how the memory of the PCI host bridge is
distributed to the PCI devices.
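To make the "pile of memory" analogy a bit more concrete, here is a toy
model of carving BARs out of a host bridge window. It is only a sketch and
not the kernel's algorithm: the function names, the largest-first ordering
and the device sizes are assumptions for illustration. The one real
constraint it keeps is that a BAR has a power-of-two size and must be
aligned to that size.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def assign_bars(window_start, window_size, bar_sizes):
    """Place each BAR inside the window, largest first.

    bar_sizes maps a (hypothetical) device name to a power-of-two BAR size.
    Returns {name: address}, or None if the window is too small.
    """
    window_end = window_start + window_size
    cursor = window_start
    placement = {}
    by_size = sorted(bar_sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        address = align_up(cursor, size)  # BARs are naturally aligned
        if address + size > window_end:
            return None  # this BAR does not fit in what is left
        placement[name] = address
        cursor = address + size
    return placement


if __name__ == "__main__":
    GiB, MiB = 2**30, 2**20
    bars = {"gpu": 16 * GiB, "nic": 1 * MiB}  # made-up sizes
    print(assign_bars(32 * GiB, 32 * GiB, bars))  # both BARs get an address
    print(assign_bars(32 * GiB, 16 * GiB, bars))  # None: window too small
```

The second call fails because the GPU BAR alone fills the whole window, so
nothing is left for the NIC.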

Now, regarding the firmware differences between OVMF and SeaBIOS. As per
the edk2 thread mentioned in the above comment, OVMF has a strict
limitation on the PCI64 aperture size. In SeaBIOS, things are a bit
different: the ACPI table passed to Linux containing the PCI64 aperture
information is the DSDT, and this table is built dynamically based on the
SSDT construction at boot time (build_ssdt() in the SeaBIOS code). This is
ultimately based on the PCI initialization routines that compute the BAR
sizes and sum all of them, given the information in the PCI devices'
configuration space. The functions involved in this process are:

pci_setup() -> pci_bios_check_devices()/pci_bios_map_devices()

There's no limit on the aperture size, which is variable and can
accommodate as many devices as the guest memory allows. In a way, this is
similar to the way Linux performs the PCI resource allocations with the
"pci=nocrs" parameter.
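As a rough illustration of "sum all the BAR sizes", the sketch below computes
the window that a set of 64-bit BARs would need. It is not the SeaBIOS code:
the function name and the device sizes are made up, and it only models the
arithmetic. Since BAR sizes are powers of two, packing them largest first
needs no padding, so the window size is just the sum and the window only has
to be aligned to the largest BAR.

```python
def required_aperture(bar_sizes):
    """Return (total_size, alignment) needed to hold all the given BARs.

    bar_sizes is a list of power-of-two BAR sizes in bytes. Packed largest
    first, every BAR starts at a multiple of its own size, so no padding
    is needed between them.
    """
    if not bar_sizes:
        return 0, 0
    return sum(bar_sizes), max(bar_sizes)


if __name__ == "__main__":
    GiB, MiB = 2**30, 2**20
    # Two hypothetical GPUs with a 16 GiB BAR each, plus a small NIC BAR.
    size, alignment = required_aperture([16 * GiB, 16 * GiB, 1 * MiB])
    print(f"window: {size / GiB:.3f} GiB, aligned to {alignment // GiB} GiB")
```

The point is that the window grows with the devices that are plugged in,
instead of the devices having to fit a size chosen in advance.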

Now, OVMF is more complex in nature. The OVMF source tree is composed
of multiple modules. The module MdeModulePkg is responsible for the PCI
enumeration in OVMF. There are 2 parts involved in that:

- The aperture is calculated in the submodule PciHostBridgeDxe; it comes
from the early portions of the firmware code (submodule
OvmfPkg/PlatformPei), in the memory detection routine (and at that point
we can override it using the experimental parameter X-PciMmio64Mb).
This is then passed to PciHostBridgeDxe, which creates a bridge with
the limits of its memory resources set.

- The PCI enumeration itself (and especially the device dropping in case the
aperture is exceeded) happens in the submodule PciBusDxe, through the
following functions:
PciBusDriverBindingStart() -> PciEnumerator() ->
PciHostBridgeResourceAllocator()

The function PciHostBridgeResourceAllocator() is the one that actually tries
to allocate the memory through what's called the Global Coherency Domain
(GCD), the edk2/UEFI generic memory/IO manager. This is done at the PCI
bridge "level", and if it fails due to lack of resources, it goes through
the following functions to free resources in the bridge:
PciHostBridgeAdjustAllocation() -> GetMaxResourceConsumerDevice()

At this point, the GPU is discarded in favor of other devices if its BAR
is too large for the OVMF PCI64 aperture limit. For reference, this is the
edk2/OVMF commit that limits the PCI64 aperture size by default:
7e5b1b670c ("OvmfPkg: PlatformPei: determine the 64-bit PCI host aperture
for X64 DXE")
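The dropping behavior can be sketched as a simple loop: while the devices do
not fit in the aperture, discard the largest consumer and try again. This is
only a toy model of the idea and not the PciBusDxe code, which works on
resource trees and takes alignment into account; the function name, the
device names, the sizes and the 32 GiB cap are assumptions for illustration.

```python
def enumerate_with_cap(aperture_size, devices):
    """Drop the largest resource consumer until the rest fits.

    devices maps a (hypothetical) device name to its total 64-bit BAR size.
    Returns (kept, dropped): the devices that fit and the names discarded.
    """
    kept = dict(devices)
    dropped = []
    while kept and sum(kept.values()) > aperture_size:
        victim = max(kept, key=kept.get)  # the biggest consumer goes first
        dropped.append(victim)
        del kept[victim]
    return kept, dropped


if __name__ == "__main__":
    GiB, MiB, KiB = 2**30, 2**20, 2**10
    devices = {"gpu": 64 * GiB, "nic": 1 * MiB, "nvme": 16 * KiB}

    # With a fixed cap smaller than the GPU BAR, the GPU is the one dropped.
    print(enumerate_with_cap(32 * GiB, devices))

    # With a larger aperture, everything is kept.
    print(enumerate_with_cap(128 * GiB, devices))
```

In this model the small devices survive and only the GPU disappears from the
guest, which matches the symptom in this bug; enlarging the aperture is what
lets the GPU stay.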

Cheers,


Guilherme


[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/341681/comments/97
[1] https://bugzilla.kernel.org/show_bug.cgi?id=14183


https://bugs.launchpad.net/bugs/1849563

Title:
  Unable to passthrough GPUs to guest
