On Wed, 18 Jan 2023 18:03:13 +0000 Jean-Philippe Brucker <jean-phili...@linaro.org> wrote:
> On Fri, Jan 13, 2023 at 10:57:00AM -0700, Alex Williamson wrote: > > On Fri, 13 Jan 2023 12:39:18 +0000 > > Jean-Philippe Brucker <jean-phili...@linaro.org> wrote: > > > > > Hi, > > > > > > On Mon, Jan 09, 2023 at 10:11:19PM +0100, Eric Auger wrote: > > > > > Jean, do you have any idea about how to fix that? Do you think we > > > > > have a > > > > > trouble in the acpi/viot setup or virtio-iommu probe sequence. It > > > > > looks > > > > > like virtio probe and attach commands are called too early, before the > > > > > bus is actually correctly numbered. > > > > > > > > So after further investigations looks this is not a problem of bus > > > > number, which is good at the time of the virtio cmd calls but rather a > > > > problem related to the devfn (0 was used when creating the IOMMU MR) > > > > whereas the virtio-iommu cmds looks for the non aliased devfn. With that > > > > fixed, the probe and attach at least succeeds. The device still does not > > > > work for me but I will continue my investigations and send a tentative > > > > fix. > > > > > > If I remember correctly VIOT can deal with bus numbers because bridges are > > > assigned a range by QEMU, but I haven't tested that in detail, and I don't > > > know how it holds with conventional PCI bridges. > > > > In my reading of the virtio-iommu spec, > > Hm, is that the virtio-iommu spec or ACPI VIOT/device tree spec? > The virtio-iommu spec shouldn't refer to PCI buses at the moment. The > intent is that for PCI, the "endpoint ID" passed in an ATTACH request > corresponds to PCI segment and RID of PCI devices at the time of the > request (so after the OS renumbered the buses). If you found something in > the spec that contradicts this, it should be fixed. Note that "endpoint" > is a misnomer, it can refer to PCI bridges as well, anything that can > issue DMA transactions. Sorry, the ACPI spec defining the VIOT table[1]: Each node identifies one or more devices using either their PCI Handle or their base MMIO (Memory-Mapped I/O) address. A PCI Handle is a PCI Segment number and a BDF (Bus-Device-Function) with the following layout: * Bits 15:8 Bus Number * Bits 7:3 Device Number * Bits 2:0 Function Number This identifier corresponds to the one observed by the operating system when parsing the PCI configuration space for the first time after boot. > > I noted that it specifies the > > bus numbers *at the time of OS handoff*, so it essentially washes its > > hands of the OS renumbering buses while leaving subtle dependencies on > > initial numbering in the guest and QEMU implementations. > > Yes we needed to describe in the firmware tables (device-tree and ACPI > VIOT) which devices the IOMMU manages. And at the time we generate the > tables, if we want to refer to PCI devices behind bridges, we can either > use catch-all ranges for any possible bus numbers they will get, or > initialize bus numbers in bridges and pass those to the OS. > > But that's only to communicate the IOMMU topology to the OS, because we > couldn't come up with anything better. After it sets up PCI the OS should > be able to use its own configuration of the PCI topology in virtio-iommu > requests. The VT-d spec[2](8.3.1) has a more elegant solution using a path described in a device scope, based on a root bus number (not susceptible to OS renumbering) and a sequence of devfns to uniquely describe a hierarchy or endpoint, invariant of OS bus renumbering. Thanks, Alex [1]https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#virtual-i-o-translation-viot-table-header [2]https://cdrdv2-public.intel.com/671081/vt-directed-io-spec.pdf