Introduce a new field to mark devices as broken: having it set prevents the device from being assigned to domains. Use the field in order to mark ATS devices that have failed a flush as broken, thus preventing them to be assigned to any guest.
This allows the device IOMMU context entry to be cleaned up properly, as calling _pci_hide_device will just change the ownership of the device, but the IOMMU context entry of the device would be left as-is. It would also leak a Domain ID, as removing the device from it's previous owner will allow releasing the DID used by the device without having cleaned up the context entry. Signed-off-by: Roger Pau Monné <roger....@citrix.com> --- RFC: I haven't tested the code path, as I have no ATS devices on the box I'm currently testing on. In any case, ATS is not supported, and removing the call to _pci_hide_device in iommu_dev_iotlb_flush_timeout should allow to remove the dependency on recursive pcidevs lock. TBD: it's unclear whether we still need the pcidevs_lock in iommu_dev_iotlb_flush_timeout. The caller of iommu_dev_iotlb_flush_timeout is already bogus as it iterates over a list of pdevs without holding the pcidevs_lock. TBD: getting rid of ATS altogether could also be an option, but it's more work. --- Cc: Oleksandr Andrushchenko <andr2...@gmail.com> --- xen/drivers/passthrough/pci.c | 11 +++++++---- xen/drivers/passthrough/vtd/qinval.c | 8 +++++++- xen/include/xen/pci.h | 3 +++ 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c index 70b6684981..4b81c1c04a 100644 --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -501,7 +501,7 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev *pdev) xfree(pdev); } -static void _pci_hide_device(struct pci_dev *pdev) +static void __init _pci_hide_device(struct pci_dev *pdev) { if ( pdev->domain ) return; @@ -1487,6 +1487,11 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag) ASSERT(pdev && (pdev->domain == hardware_domain || pdev->domain == dom_io)); + /* Do not allow broken devices to be assigned. */ + rc = -EBADF; + if ( pdev->broken ) + goto done; + rc = pdev_msix_assign(d, pdev); if ( rc ) goto done; @@ -1585,9 +1590,7 @@ void iommu_dev_iotlb_flush_timeout(struct domain *d, struct pci_dev *pdev) return; } - list_del(&pdev->domain_list); - pdev->domain = NULL; - _pci_hide_device(pdev); + pdev->broken = true; if ( !d->is_shutting_down && printk_ratelimit() ) printk(XENLOG_ERR "dom%d: ATS device %pp flush failed\n", diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c index 9f291f47e5..4436c22c05 100644 --- a/xen/drivers/passthrough/vtd/qinval.c +++ b/xen/drivers/passthrough/vtd/qinval.c @@ -227,7 +227,7 @@ static int __must_check dev_invalidate_sync(struct vtd_iommu *iommu, ASSERT(iommu->qinval_maddr); rc = queue_invalidate_wait(iommu, 0, 1, 1, 1); - if ( rc == -ETIMEDOUT ) + if ( rc == -ETIMEDOUT && !pdev->broken ) { struct domain *d = rcu_lock_domain_by_id(did_to_domain_id(iommu, did)); @@ -241,6 +241,12 @@ static int __must_check dev_invalidate_sync(struct vtd_iommu *iommu, iommu_dev_iotlb_flush_timeout(d, pdev); rcu_unlock_domain(d); } + else if ( rc == -ETIMEDOUT ) + /* + * The device is already marked as broken, ignore the error in order to + * allow deassign to succeed. + */ + rc = 0; return rc; } diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h index b6d7e454f8..00b44e8487 100644 --- a/xen/include/xen/pci.h +++ b/xen/include/xen/pci.h @@ -108,6 +108,9 @@ struct pci_dev { /* Device with errata, ignore the BARs. */ bool ignore_bars; + /* Device misbehaving, prevent assigning it. */ + bool broken; + enum pdev_type { DEV_TYPE_PCI_UNKNOWN, DEV_TYPE_PCIe_ENDPOINT, -- 2.34.1