On 08.02.22 13:00, Jan Beulich wrote:
> On 08.02.2022 11:52, Oleksandr Andrushchenko wrote:
>> This smells like we first need to fix the existing code, so
>> pdev->domain is not assigned by specific IOMMU implementations,
>> but instead controlled by the code which relies on that, assign_device.
> Feel free to come up with proposals how to cleanly do so. Moving the
> assignment to pdev->domain may even be possible now, but if you go
> back you may find that the code was quite different earlier on.
I do understand that as the code evolves new use cases bring
new issues.
>
>> I can have something like:
>>
>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>> index 88836aab6baf..cc7790709a50 100644
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -1475,6 +1475,7 @@ static int device_assigned(u16 seg, u8 bus, u8 devfn)
>> static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32
>> flag)
>> {
>> const struct domain_iommu *hd = dom_iommu(d);
>> + struct domain *old_owner;
>> struct pci_dev *pdev;
>> int rc = 0;
>>
>> @@ -1490,6 +1491,9 @@ static int assign_device(struct domain *d, u16 seg, u8
>> bus, u8 devfn, u32 flag)
>> ASSERT(pdev && (pdev->domain == hardware_domain ||
>> pdev->domain == dom_io));
>>
>> + /* We need to restore the old owner in case of an error. */
>> + old_owner = pdev->domain;
>> +
>> vpci_deassign_device(pdev->domain, pdev);
>>
>> rc = pdev_msix_assign(d, pdev);
>> @@ -1515,8 +1519,12 @@ static int assign_device(struct domain *d, u16 seg,
>> u8 bus, u8 devfn, u32 flag)
>>
>> done:
>> if ( rc )
>> + {
>> printk(XENLOG_G_WARNING "%pd: assign (%pp) failed (%d)\n",
>> d, &PCI_SBDF3(seg, bus, devfn), rc);
>> + /* We failed to assign, so restore the previous owner. */
>> + pdev->domain = old_owner;
>> + }
>> /* The device is assigned to dom_io so mark it as quarantined */
>> else if ( d == dom_io )
>> pdev->quarantine = true;
>>
>> But I do not think this belongs to this patch
> Indeed. Plus I'm sure you understand that it's not that simple. Assigning
> to pdev->domain is only the last step of assignment. Restoring the original
> owner would entail putting in place the original IOMMU table entries as
> well, which in turn can fail. Hence why you'll find a number of uses of
> domain_crash() in places where rolling back is far from easy.
So, why don't we just rely on the toolstack to do the roll back then?
This way we won't add new domain_crash() calls.
I do understand though that we may live Xen in a wrong state though.
So, do you think it is possible if we just call deassign_device from
assign_device on the error path? This is just like I do in vpci_assign_device:
I call vpci_deassign_device if the former fails.
> Jan
>
Thank you,
Oleksandr