On 13.11.2020 12:06, Julien Grall wrote:
> Hi Jan,
> 
> On 13/11/2020 10:53, Jan Beulich wrote:
>> On 13.11.2020 11:36, Julien Grall wrote:
>>> On 13/11/2020 10:25, Jan Beulich wrote:
>>>> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>>>>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>>>>> On Thu, Nov 12, 2020 at 01:16:10PM +0000, Oleksandr Andrushchenko wrote:
>>>>>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>>>>>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko 
>>>>>>>> wrote:
>>>>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushche...@epam.com>
>>>>>>>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, 
>>>>>>>>> unsigned int reg,
>>>>>>>>> +                                  void *data)
>>>>>>>>> +{
>>>>>>>>> +    struct vpci_bar *vbar, *bar = data;
>>>>>>>>> +
>>>>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>>>>> +        return bar_read_hwdom(pdev, reg, data);
>>>>>>>>> +
>>>>>>>>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>>>>> +    if ( !vbar )
>>>>>>>>> +        return ~0;
>>>>>>>>> +
>>>>>>>>> +    return bar_read_guest(pdev, reg, vbar);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned 
>>>>>>>>> int reg,
>>>>>>>>> +                               uint32_t val, void *data)
>>>>>>>>> +{
>>>>>>>>> +    struct vpci_bar *bar = data;
>>>>>>>>> +
>>>>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>>>>> +        bar_write_hwdom(pdev, reg, val, data);
>>>>>>>>> +    else
>>>>>>>>> +    {
>>>>>>>>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>>>>>>>>> bar->index);
>>>>>>>>> +
>>>>>>>>> +        if ( !vbar )
>>>>>>>>> +            return;
>>>>>>>>> +        bar_write_guest(pdev, reg, val, vbar);
>>>>>>>>> +    }
>>>>>>>>> +}
>>>>>>>> You should assign different handlers based on whether the domain that
>>>>>>>> has the device assigned is a domU or the hardware domain, rather than
>>>>>>>> doing the selection here.
>>>>>>> Hm, handlers are assigned once in init_bars and this function is only 
>>>>>>> called
>>>>>>>
>>>>>>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>>>>>>> dispatcher
>>>>>> I think we might want to reset the vPCI handlers when a devices gets
>>>>>> assigned and deassigned.
>>>>>
>>>>> In ARM case init_bars is called too early: PCI device assignment is 
>>>>> currently
>>>>>
>>>>> initiated by Domain-0' kernel and is done *before* PCI devices are given 
>>>>> memory
>>>>>
>>>>> ranges and BARs assigned:
>>>>>
>>>>> [    0.429514] pci_bus 0000:00: root bus resource [bus 00-ff]
>>>>> [    0.429532] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
>>>>> [    0.429555] pci_bus 0000:00: root bus resource [mem 
>>>>> 0xfe200000-0xfe3fffff]
>>>>> [    0.429575] pci_bus 0000:00: root bus resource [mem 
>>>>> 0x30000000-0x37ffffff]
>>>>> [    0.429604] pci_bus 0000:00: root bus resource [mem 
>>>>> 0x38000000-0x3fffffff pref]
>>>>> [    0.429670] pci 0000:00:00.0: enabling Extended Tags
>>>>> [    0.453764] pci 0000:00:00.0: -------------------- 
>>>>> BUS_NOTIFY_ADD_DEVICE
>>>>>
>>>>> < init_bars >
>>>>>
>>>>> [    0.453793] pci 0000:00:00.0: -- IRQ 0
>>>>> [    0.458825] pci 0000:00:00.0: Failed to add - passthrough or MSI/MSI-X 
>>>>> might fail!
>>>>> [    0.471790] pci 0000:01:00.0: -------------------- 
>>>>> BUS_NOTIFY_ADD_DEVICE
>>>>>
>>>>> < init_bars >
>>>>>
>>>>> [    0.471821] pci 0000:01:00.0: -- IRQ 255
>>>>> [    0.476809] pci 0000:01:00.0: Failed to add - passthrough or MSI/MSI-X 
>>>>> might fail!
>>>>>
>>>>> < BAR assignments below >
>>>>>
>>>>> [    0.488233] pci 0000:00:00.0: BAR 14: assigned [mem 
>>>>> 0xfe200000-0xfe2fffff]
>>>>> [    0.488265] pci 0000:00:00.0: BAR 15: assigned [mem 
>>>>> 0x38000000-0x380fffff pref]
>>>>>
>>>>> In case of x86 this is pretty much ok as BARs are already in place, but 
>>>>> for ARM we
>>>>>
>>>>> need to take care and re-setup vPCI BARs for hwdom.
>>>>
>>>> Even on x86 there's no guarantee that all devices have their BARs set
>>>> up by firmware.
>>>>
>>>> In a subsequent reply you've suggested to move init_bars from "add" to
>>>> "assign", but I'm having trouble seeing what this would change: It's
>>>> not Dom0 controlling assignment (to itself), but Xen assigns the device
>>>> towards the end of pci_add_device().
>>>>
>>>>> Things are getting even more
>>>>>
>>>>> complicated if the host PCI bridge is not ECAM like, so you cannot set 
>>>>> mmio_handlers
>>>>>
>>>>> and trap hwdom's access to the config space to update BARs etc. This is 
>>>>> why I have that
>>>>>
>>>>> ugly hack for rcar_gen3 to read actual BARs for hwdom.
>>>>
>>>> How to config space accesses work there? The latest for MSI/MSI-X it'll
>>>> be imperative that Xen be able to intercept config space writes.
>>>
>>> I am not sure to understand your last sentence. Are you saying that we
>>> always need to trap access to MSI/MSI-X message in order to sanitize it?
>>>
>>> If one is using the GICv3 ITS (I haven't investigated other MSI
>>> controller), then I don't believe you need to sanitize the MSI/MSI-X
>>> message in most of the situation.
>>
>> Well, if it's fine for the guest to write arbitrary values to message
>> address and message data,
> 
> The message address would be the doorbell of the ITS that is usually 
> going through the IOMMU page-tables. Although, I am aware of a couple of 
> platforms where the doorbell access (among other address ranges 
> including P2P transaction) bypass the IOMMU. In this situation, we would 
> need a lot more work than just trapping the access.
When you say "The message address would be the doorbell of the ITS" am
I right in understanding that's the designated address to be put there?
What if the guest puts some random different address there?

> Regarding the message data, for the ITS this is an event ID. The HW will 
> then tag each message with the device ID (this prevents spoofing). The 
> tupple (device ID, event ID) is used by the ITS to decide where to 
> inject the event.
> 
> Whether other MSI controllers (e.g. GICv2m) have similar isolation 
> feature will be on the case by case basis.
> 
>> _and_ to arbitrarily enable/disable MSI / MSI-X,
>> then yes, no interception would be needed.
> The device would be owned by the guest, so I am not sure to understand 
> the exact problem of letting it enabling/disabling MSI/MSI-X. Do you 
> mind expanding your thoughts?

Question is - is Xen involved in any way in the handling of interrupts
from such a device? If not, then I guess full control can indeed be
left with the guest.

> Furthermore, you can also control which event is enabled/disabled at the 
> ITS level.

And that's something Xen controls? On x86 we don't have a 2nd level
of controls, so we need to merge Xen's and the guest's intentions in
software to know what to store in hardware.

Jan

Reply via email to