On Mon, 2015-10-05 at 07:20 +0000, Bhushan Bharat wrote:
> 
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Saturday, October 03, 2015 4:17 AM
> > To: Bhushan Bharat-R65777 <bharat.bhus...@freescale.com>
> > Cc: kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org;
> > christoffer.d...@linaro.org; eric.au...@linaro.org;
> > pranavku...@linaro.org; marc.zyng...@arm.com; will.dea...@arm.com
> > Subject: Re: [RFC PATCH 5/6] vfio-pci: Create iommu mapping for msi
> > interrupt
> >
> > On Wed, 2015-09-30 at 20:26 +0530, Bharat Bhushan wrote:
> > > An MSI address is allocated and programmed into the PCIe device
> > > during interrupt configuration. Now, for a pass-through device, try
> > > to create an iommu mapping for this allocated/programmed MSI
> > > address. If the iommu mapping is created and the MSI address
> > > programmed into the PCIe device differs from the msi-iova as per
> > > the iommu mapping, reconfigure the PCI device to use the msi-iova
> > > as its MSI address.
> > >
> > > Signed-off-by: Bharat Bhushan <bharat.bhus...@freescale.com>
> > > ---
> > >  drivers/vfio/pci/vfio_pci_intrs.c | 36 ++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 34 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> > > index 1f577b4..c9690af 100644
> > > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > > +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> > > @@ -312,13 +312,23 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > >  	int irq = msix ? vdev->msix[vector].vector : pdev->irq + vector;
> > >  	char *name = msix ? "vfio-msix" : "vfio-msi";
> > >  	struct eventfd_ctx *trigger;
> > > +	struct msi_msg msg;
> > > +	struct vfio_device *device;
> > > +	uint64_t msi_addr, msi_iova;
> > >  	int ret;
> > >
> > >  	if (vector >= vdev->num_ctx)
> > >  		return -EINVAL;
> > >
> > > +	device = vfio_device_get_from_dev(&pdev->dev);
> >
> > Have you looked at this function?  I don't think we want to be doing
> > that every time we want to poke the interrupt configuration.
> 
> I am trying to describe what I understood: a device can have many
> interrupts, and we should set up the iommu mapping only once, when
> called for the first time to enable/set up an interrupt. Similarly,
> when disabling interrupts, we should iommu-unmap when called for the
> last enabled interrupt of that device. With this understanding, should
> I move this map/unmap into separate functions and call them from
> vfio_msi_set_block() rather than from vfio_msi_set_vector_signal()?
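A minimal sketch of the split being proposed, for illustration only
(the msi_map_refs field and both helper names are hypothetical;
vfio_device_map_msi()/vfio_device_unmap_msi() are the helpers this RFC
series introduces):

	/* Map the MSI doorbell once, on the first vector enabled for
	 * the device; drop the mapping when the last vector goes away.
	 */
	static int vfio_msi_map_once(struct vfio_pci_device *vdev,
				     struct vfio_device *device,
				     uint64_t msi_addr, uint64_t *msi_iova)
	{
		int ret;

		if (vdev->msi_map_refs++)
			return 0;	/* another vector already mapped it */

		ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE, msi_iova);
		if (ret)
			vdev->msi_map_refs--;
		return ret;
	}

	static void vfio_msi_unmap_last(struct vfio_pci_device *vdev,
					struct vfio_device *device,
					uint64_t msi_iova)
	{
		if (!--vdev->msi_map_refs)
			vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
	}

vfio_msi_set_block() would then call these once per device rather than
once per vector.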
Interrupts can be set up and torn down at any time, and I don't see how
one function or the other makes much difference.
vfio_device_get_from_dev() is enough overhead that the data we need
should be cached if we're going to call it with some regularity.  Maybe
vfio_iommu_driver_ops.open() should be called with a pointer to the
vfio_device... or the vfio_group.

> > Also note that IOMMU mappings don't operate on devices, but groups,
> > so maybe we want to pass the group.
> 
> Yes, it operates on a group. I hesitated to add an API to get the
> group. Do you suggest that it is ok to add an API to get the group
> from a device?

No, the above suggestion is probably better.

> > > +	if (device == NULL)
> > > +		return -EINVAL;
> >
> > This would be a legitimate BUG_ON(!device)
> >
> > > +
> > >  	if (vdev->ctx[vector].trigger) {
> > >  		free_irq(irq, vdev->ctx[vector].trigger);
> > > +		get_cached_msi_msg(irq, &msg);
> > > +		msi_iova = ((u64)msg.address_hi << 32) | msg.address_lo;
> > > +		vfio_device_unmap_msi(device, msi_iova, PAGE_SIZE);
> > >  		kfree(vdev->ctx[vector].name);
> > >  		eventfd_ctx_put(vdev->ctx[vector].trigger);
> > >  		vdev->ctx[vector].trigger = NULL;
> > > @@ -346,12 +356,11 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > >  	 * cached value of the message prior to enabling.
> > >  	 */
> > >  	if (msix) {
> > > -		struct msi_msg msg;
> > > -
> > >  		get_cached_msi_msg(irq, &msg);
> > >  		pci_write_msi_msg(irq, &msg);
> > >  	}
> > >
> > > +
> >
> > gratuitous newline
> >
> > >  	ret = request_irq(irq, vfio_msihandler, 0,
> > >  			  vdev->ctx[vector].name, trigger);
> > >  	if (ret) {
> > > @@ -360,6 +369,29 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
> > >  		return ret;
> > >  	}
> > >
> > > +	/* Re-program the new iova in the pci device in case a
> > > +	 * different iommu mapping was created for the programmed
> > > +	 * msi address.
> > > +	 */
> > > +	get_cached_msi_msg(irq, &msg);
> > > +	msi_iova = 0;
> > > +	msi_addr = ((u64)msg.address_hi << 32) | (u64)msg.address_lo;
> > > +	ret = vfio_device_map_msi(device, msi_addr, PAGE_SIZE, &msi_iova);
> > > +	if (ret) {
> > > +		free_irq(irq, vdev->ctx[vector].trigger);
> > > +		kfree(vdev->ctx[vector].name);
> > > +		eventfd_ctx_put(trigger);
> > > +		return ret;
> > > +	}
> > > +
> > > +	/* Reprogram only if the iommu-mapped iova differs from the msi address */
> > > +	if (msi_iova && (msi_iova != msi_addr)) {
> > > +		msg.address_hi = (u32)(msi_iova >> 32);
> > > +		/* Keep lower bits from the original msi message address */
> > > +		msg.address_lo &= PAGE_MASK;
> > > +		msg.address_lo |= (u32)(msi_iova & 0x00000000ffffffff);
> >
> > Seems like you're making some assumptions here that are dependent on
> > the architecture and maybe the platform.
> 
> What I tried is to map the msi page with a different iova, which is
> page-size aligned; the offset within the page remains the same. For
> example, the original msi address was 0x0603_0040 and we have a
> reserved region at 0xf000_0000, so an iommu mapping is created for
> 0xf000_0000 => 0x0603_0000 of size 0x1000.
> 
> So the new address to be programmed into the device is 0xf000_0040,
> i.e. offset 0x40 added to the base address of the iommu mapping.

Don't you need ~PAGE_MASK for it to work like that?  The & with
0x00000000ffffffff shouldn't be needed either, certainly not with all
the leading zeros.
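With ~PAGE_MASK the reprogramming step would keep the offset within the
page and take the page frame from the iova; a sketch, assuming msi_iova
is page aligned and the mapping is PAGE_SIZE:

	/* e.g. (0x0603_0040 & ~PAGE_MASK) | 0xf000_0000 == 0xf000_0040 */
	msg.address_hi = (u32)(msi_iova >> 32);
	msg.address_lo = (msg.address_lo & ~PAGE_MASK) | (u32)msi_iova;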
> > > +		pci_write_msi_msg(irq, &msg);
> > > +	}
> > > +
> > >  	vdev->ctx[vector].trigger = trigger;
> > >
> > >  	return 0;