On Wed, Dec 11, 2024 at 03:21:37PM +0000, Shameerali Kolothum Thodi wrote: > > Once a device reaches to the pci_device_set_iommu_device() call, > > it should be attached to an IDENTIY/bypass proxy s1_hwpt, so the > > smmu_find_add_as() will return the iommu as. > > Agree. The above situation you explained is perfectly fine with vfio-pci dev. > > > So, the fact that your hack is working means the hotplug routine > > is likely missing a pci_device_set_iommu_device() call, IMHO, or > > probably it should do pci_device_iommu_address_space() after the > > device finishes pci_device_set_iommu_device() instead.. > > The problem is not with the hot added vfio-pci dev but with the > pcie-root-port device. When we hot add a vfio-pci to a root port, > Qemu will inject an interrupt for the Guest root port device and > that kick starts the vfio-pci device add process. This involves writing > to the MSI address the Guest kernel configures for the root port dev. > > As per the current logic, the root port dev will have sysmem address > space and in IORT we have root port dev id in smmu idmap. This > will not work as Guest kernel configures a translated IOVA for MSI. > > I think we have discussed this issue of returning different address > spaces before here[0]. But that was in a different context though. > The hack mentioned in [0] actually works for this case as well, where > we add an extra check to see the dev is vfio-pci or not. But I am not > sure that is the best way to handle this. > > Another option is to exclude all the root port devices from IORT idmap. > But that looks not an ideal one to me as it actually sits behind an SMMUv3 > in this case. > > Please let me know if you have any ideas.
Oh... I completely forgot that... So, we need to make sure the sdev/PCIDevice is a passthrough dev that will go through the set_iommu_device callback. Otherwise, just return the iommu address space. Perhaps we could set a flag during vfio_realize() in PCIDevice * pdev, so later we could cast the sdev to pdev and recheck that. Or, we could do something like your approach: ----------------------------------------------------------------- @@ -896,9 +896,11 @@ static AddressSpace *smmu_find_add_as(PCIBus *bus, void *opaque, int devfn) SMMUState *s = opaque; SMMUPciBus *sbus = smmu_get_sbus(s, bus); SMMUDevice *sdev = smmu_get_sdev(s, sbus, bus, devfn); + PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn); + bool has_iommufd = !!object_property_find(OBJECT(pdev), "iommufd"); /* Return the system as if the device uses stage-2 only */ - if (s->nested && !sdev->s1_hwpt) { + if (s->nested && !sdev->s1_hwpt && has_iommufd) { return &sdev->as_sysmem; } else { return &sdev->as; ----------------------------------------------------------------- vfio-pci might not guarantee that it has an "iommufd" property so checking the property explicitly might be nicer. Thanks Nic