On 06/07/17 22:11, Auger Eric wrote: > Hello Bharat, Jean-Philippe, > On 06/07/2017 12:02, Jean-Philippe Brucker wrote: >> On 05/07/17 09:49, Bharat Bhushan wrote:>>> Also when setup msi-route >> kvm_irqchip_add_msi_route() we needed to >>>> provide the translated address. >>>>> According to my understanding this is required because kernel does no go >>>> through viommu translation when generating interrupt, no? >>>> >>>> yes this is needed when KVM MSI routes are set up, ie. along with GICV3 >>>> ITS. >>>> With GICv2M, qemu direct gsi mapping is used and this is not needed. >>>> >>>> So I do not understand your previous sentence saying "MSI interrupts works >>>> without any change". >>> >>> I have almost completed vfio integration with virtio-iommu and now testing >>> the changes by assigning e1000 device to VM. For this I have changed >>> virtio-iommu driver to use IOMMU_RESV_MSI rather than sw-msi and this does >>> not need changed in vfio_get_addr() and kvm_irqchip_add_msi_route() >> >> I understand you're reserving region 0x08000000-0x08100000 as >> IOMMU_RESV_MSI instead of IOMMU_RESV_SW_MSI? I think this only works >> because Qemu places the vgic in that area as well (in hw/arm/virt.c). It's >> not a coincidence if the addresses are the same, because Eric chose them >> for the Linux SMMU drivers and I copied them. > > Yes I chose this region because it does not overlap with any guest RAM > region > >> >> We can't rely on that behavior, though, it will break MSIs in emulated >> devices. And if Qemu happens to move the MSI doorbell in future machine >> revisions, then it would also break VFIO. >> >> Just for my own understanding -- what happens, I think, is that in Linux >> iova_reserve_iommu_regions initially reserves the guest-physical doorbell >> 0x08000000-0x08100000. Then much later, when the device driver requests an >> MSI, the irqchip driver calls iommu_dma_map_msi_msg with the >> guest-physical gicv2m address 0x08020000. The function finds the right >> page in msi_page_list, which was added by cookie_init_hw_msi_region, >> therefore bypassing the viommu and the GPA gets written in the MSI-X table. > > I share Jean's understanding. To me using IOMMU_RESV_MSI in the > virtio-iommu means this region is not translated by the IOMMU. as > cookie_init_hw_msi_region() pre-allocates the msi_page array, > iommu_dma_get_msi_page() does not do any IOMMU mapping. > >> >> If an emulated device such as virtio-net-pci were to generate an MSI, then >> Qemu would attempt to access the doorbell written by Linux into the MSI-X >> table, 0x08020000, and fault because that address wasn't mapped in the >> viommu. > Yes so I am confused, how can it work with a virtio-net-pci or > passthrough'ed e1000e device using MSIs? >> >> So for VFIO, you either need to translate the MSI-X entry using the >> viommu, > > For the vsmmuv3 I created a dedicated IOMMUNotifier to handle the fact > the MSI doorbell is translated and MSI routes need to be updated. This > seems to work. > > or just assume that the vaddr corresponds to the only MSI doorbell >> accessible by this device (because how can we be certain that the guest >> already mapped the doorbell before writing the entry?) >> >> For ARM machines it's probably best to stick with IOMMU_RESV_SW_MSI. >> However, a nice way to use IOMMU_RESV_MSI would be for the virtio-iommu >> device to advertise identity-mapped/reserved regions, and bypass >> translation on these regions. Then the driver could reserve those with >> IOMMU_RESV_MSI. > > At least we may need to configure the virtio-iommu to either bypass MSIs > (x86) or translate MSIs (ARM)?
Yes, see the VIRTIO_IOMMU_T_PROBE proposal in, er, my other reply. > For x86 we will need such a system, with an added IRQ >> remapping feature. > Meaning this must live along with vIR, is that what you mean? Also on > ARM this must live with vITS anyway. This is an orthogonal feature, right? Reserving doorbells regions on x86 is a must otherwise MSIs won't work. IRQ remapping would be nice to add in some distant future. Thanks, Jean