On Wed, Apr 29, 2015 at 07:51:21PM +1000, Alexey Kardashevskiy wrote: > On 04/29/2015 02:18 PM, David Gibson wrote: > >On Sat, Apr 25, 2015 at 10:14:42PM +1000, Alexey Kardashevskiy wrote: > >>At the moment writing new TCE value to the IOMMU table fails with EBUSY > >>if there is a valid entry already. However PAPR specification allows > >>the guest to write new TCE value without clearing it first. > >> > >>Another problem this patch is addressing is the use of pool locks for > >>external IOMMU users such as VFIO. The pool locks are to protect > >>DMA page allocator rather than entries and since the host kernel does > >>not control what pages are in use, there is no point in pool locks and > >>exchange()+put_page(oldtce) is sufficient to avoid possible races. > >> > >>This adds an exchange() callback to iommu_table_ops which does the same > >>thing as set() plus it returns replaced TCE and DMA direction so > >>the caller can release the pages afterwards. The exchange() receives > >>a physical address unlike set() which receives linear mapping address; > >>and returns a physical address as the clear() does. > >> > >>This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement > >>for a platform to have exchange() implemented in order to support VFIO. > >> > >>This replaces iommu_tce_build() and iommu_clear_tce() with > >>a single iommu_tce_xchg(). > >> > >>This makes sure that TCE permission bits are not set in TCE passed to > >>IOMMU API as those are to be calculated by platform code from DMA direction. > >> > >>This moves SetPageDirty() to the IOMMU code to make it work for both > >>VFIO ioctl interface in in-kernel TCE acceleration (when it becomes > >>available later). > >> > >>Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > >>[aw: for the vfio related changes] > >>Acked-by: Alex Williamson <alex.william...@redhat.com> > > > >This looks mostly good, but there are couple of details that need fixing. > > > > > [...] > > >>diff --git a/arch/powerpc/platforms/powernv/pci.c > >>b/arch/powerpc/platforms/powernv/pci.c > >>index ba75aa5..e8802ac 100644 > >>--- a/arch/powerpc/platforms/powernv/pci.c > >>+++ b/arch/powerpc/platforms/powernv/pci.c > >>@@ -598,6 +598,23 @@ int pnv_tce_build(struct iommu_table *tbl, long index, > >>long npages, > >> return 0; > >> } > >> > >>+#ifdef CONFIG_IOMMU_API > >>+int pnv_tce_xchg(struct iommu_table *tbl, long index, > >>+ unsigned long *tce, enum dma_data_direction *direction) > >>+{ > >>+ u64 proto_tce = iommu_direction_to_tce_perm(*direction); > >>+ unsigned long newtce = *tce | proto_tce; > >>+ unsigned long idx = index - tbl->it_offset; > > > >Should this have a BUG_ON or WARN_ON if the supplied tce has bits set > >below the page mask? > > > Why? The caller checks these bits, do we really need to duplicate it > here?
Because this is the crunch point where bad bits will actually cause strange stuff to happen. As much as anything the point of a BUG_ON would be to document that this function expects the parameter to be aligned. > > > >>+ *tce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce)); > >>+ *tce = be64_to_cpu(*tce); > >>+ *direction = iommu_tce_direction(*tce); > >>+ *tce &= ~(TCE_PCI_READ | TCE_PCI_WRITE); > >>+ > >>+ return 0; > >>+} > >>+#endif > > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
pgpguCMEm1obR.pgp
Description: PGP signature
_______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev