On 24.02.2022 15:53, Roger Pau Monné wrote:
> On Thu, Feb 24, 2022 at 01:58:31PM +0100, Jan Beulich wrote:
>> On 24.02.2022 13:43, Roger Pau Monne wrote:
>>> TBD: it's unclear whether we still need the pcidevs_lock in
>>> iommu_dev_iotlb_flush_timeout. The caller of
>>> iommu_dev_iotlb_flush_timeout is already bogus as it iterates over a
>>> list of pdevs without holding the pcidevs_lock.
>>
>> Analysis of whether / where recursive uses are needed should imo
>> include cases where the lock ought to be held, but currently isn't
>> (like apparently this case).
> 
> Well, I'm not opposed to that. I think just aiming to get the current
> usages analyzed will already be hard.

While I fully agree, the decision to drop recursive locking would better
not put roadblocks in the way of adding locking where it is currently
missing.

>>> @@ -1487,6 +1487,11 @@ static int assign_device(struct domain *d, u16 seg, 
>>> u8 bus, u8 devfn, u32 flag)
>>>      ASSERT(pdev && (pdev->domain == hardware_domain ||
>>>                      pdev->domain == dom_io));
>>>  
>>> +    /* Do not allow broken devices to be assigned. */
>>> +    rc = -EBADF;
>>> +    if ( pdev->broken )
>>> +        goto done;
>>
>> I think this wants exceptions for Dom0 and DomIO. An admin may be
>> able to fix things in Dom0, e.g. by updating device firmware.
> 
> Doesn't a device get assigned to DomIO (or Dom0 if not using quarantine
> mode), and then when deassigned from DomIO gets assigned to Dom0?
> 
> So there's no usage of assign_device in the path that (re)assigns a
> device used by a guest into Dom0?

Well, this assumes all tool stacks behave like the xl one presently
does. Which I think is the wrong way round: Making a device assignable
should be "deassign from Dom0" (implicitly making it land in DomIO),
while removing a device from the set of assignable ones should be
"assign to Dom0" (implicitly deassigning from DomIO). That way (I
think I said so elsewhere earlier on) libxl would also avoid the need
to actually use DOMID_IO explicitly. It using DOMID_SELF like done in
various other places would seem more clen to me.

Paul - I think it was you who decided to make it work the way it
currently does. Do you have any insights into your thought process
back then which you could share?

> Or would you rather imply that pdev->broken should get cleared at some
> point (ie: when the device is assigned back to Dom0)?

I did ponder this for a while when writing the earlier reply. But I
decided against, for it being a functional change: _pci_hide_device()
currently is also sticky. But yes, in principle if a misbehaving
device _can_ be fixed (e.g. by updating its firmware), then I think
there needs to be a way to clear the "broken" flag.

Jan


Reply via email to