Hi Julien,
On 5/8/2024 5:54 AM, Julien Grall wrote:
Hi Henry,
What if the DT overlay is unloaded and then reloaded? Wouldn't the
same interrupt be re-used? As a more generic case, this could also
be a new bitstream for the FPGA.
But even if the interrupt is brand new every time for the DT
overlay, you are effectively relaxing the check for every user (such
as XEN_DOMCTL_bind_pt_irq). So the interrupt re-use case needs to be
taken into account.
I agree. I think IIUC, with your explanation here and below, could we
simplify the problem to how to properly handle the removal of the IRQ
from a running guest, if we always properly remove and clean up the
information when remove the IRQ from the guest? In this way, the IRQ
can always be viewed as a brand new one when we add it back.
If we can make sure the virtual IRQ and physical IRQ is cleaned then yes.
Then the only corner case that we need to take care of would be...
Can you clarify whether you say the "only corner case" because you
looked at the code? Or is it just because I mentioned only one?
Well, I indeed checked the code and to my best knowledge the corner case
that you pointed out would be the only one I can think of.
Xen allows the guest to enable a vIRQ even if there is no pIRQ
assigned. Thanksfully, it looks like the vgic_connect_hw_irq(), in
both the current and new vGIC, will return an error if we are trying
to route a pIRQ to an already enabled vIRQ.
But we need to investigate all the possible scenarios to make sure
that any inconsistencies between the physical state and virtual
state (including the LRs) will not result to bigger problem.
The one that comes to my mind is: The physical interrupt is
de-assigned from the guest before it was EOIed. In this case, the
interrupt will still be in the LR with the HW bit set. This would
allow the guest to EOI the interrupt even if it is routed to someone
else. It is unclear what would be the impact on the other guest.
...same as this case, i.e.
test_bit(_IRQ_INPROGRESS, &desc->status) || !test_bit(_IRQ_DISABLED,
&desc->status)) when we try to remove the IRQ from a running domain.
We already call ->shutdown() which will disable the IRQ. So don't we
only need to take care of _IRQ_INPROGRESS?
Yes you are correct.
we have 3 possible states which can be read from LR for this case :
active, pending, pending and active.
- I don't think we can do anything about the active state, so we
should return -EBUSY and reject the whole operation of removing the
IRQ from running guest, and user can always retry this operation.
This would mean a malicious/buggy guest would be able to prevent a
device to be de-assigned. This is not a good idea in particular when
the domain is dying.
That said, I think you can handle this case. The LR has a bit to
indicate whether the pIRQ needs to be EOIed. You can clear it and this
would prevent the guest to touch the pIRQ. There might be other
clean-up to do in the vGIC datastructure.
I probably misunderstood this sentence, do you mean the EOI bit in the
pINTID field? I think this bit is only available when the HW bit of LR
is 0, but in our case the HW is supposed to be 1 (as indicated as your
previous comment). Would you mind clarifying a bit more? Thanks!
Anyway, we don't have to handle removing an active IRQ when the domain
is still running (although we do when the domain is destroying). But I
think this would need to be solved before the feature is (security)
supported.
- For the pending (and active) case,
Shouldn't the pending and active case handled the same way as the
active case?
Sorry, yes you are correct.
Kind regards,
Henry