On Thu, Feb 24, 2022 at 01:58:31PM +0100, Jan Beulich wrote: > On 24.02.2022 13:43, Roger Pau Monne wrote: > > Introduce a new field to mark devices as broken: having it set > > prevents the device from being assigned to domains. Use the field in > > order to mark ATS devices that have failed a flush as broken, thus > > preventing them to be assigned to any guest. > > > > This allows the device IOMMU context entry to be cleaned up properly, > > as calling _pci_hide_device will just change the ownership of the > > device, but the IOMMU context entry of the device would be left as-is. > > It would also leak a Domain ID, as removing the device from it's > > previous owner will allow releasing the DID used by the device without > > having cleaned up the context entry. > > > > Signed-off-by: Roger Pau Monné <roger....@citrix.com> > > --- > > RFC: I haven't tested the code path, as I have no ATS devices on the > > box I'm currently testing on. In any case, ATS is not supported, and > > removing the call to _pci_hide_device in iommu_dev_iotlb_flush_timeout > > should allow to remove the dependency on recursive pcidevs lock. > > No objection in principle. Whether this is the only dependency on > recursive pcidevs lock isn't really know though, is it?
Indeed. I didn't word this clearly. The recursive lock was introduced for this specific use case. Whether we have gained more recursive paths in the meantime I haven't assessed. > > TBD: it's unclear whether we still need the pcidevs_lock in > > iommu_dev_iotlb_flush_timeout. The caller of > > iommu_dev_iotlb_flush_timeout is already bogus as it iterates over a > > list of pdevs without holding the pcidevs_lock. > > Analysis of whether / where recursive uses are needed should imo > include cases where the lock ought to be held, but currently isn't > (like apparently this case). Well, I'm not opposed to that. I think just aiming to get the current usages analyzed will already be hard. > > @@ -1487,6 +1487,11 @@ static int assign_device(struct domain *d, u16 seg, > > u8 bus, u8 devfn, u32 flag) > > ASSERT(pdev && (pdev->domain == hardware_domain || > > pdev->domain == dom_io)); > > > > + /* Do not allow broken devices to be assigned. */ > > + rc = -EBADF; > > + if ( pdev->broken ) > > + goto done; > > I think this wants exceptions for Dom0 and DomIO. An admin may be > able to fix things in Dom0, e.g. by updating device firmware. Doesn't a device get assigned to DomIO (or Dom0 if not using quarantine mode), and then when deassigned from DomIO gets assigned to Dom0? So there's no usage of assign_device in the path that (re)assigns a device used by a guest into Dom0? Or would you rather imply that pdev->broken should get cleared at some point (ie: when the device is assigned back to Dom0)? Thanks, Roger.