On Thu, Jul 14, 2016 at 3:16 PM, Alexey Kardashevskiy <a...@ozlabs.ru> wrote: > At the moment VFIO IOMMU SPAPR v2 driver pins all guest RAM pages when > the userspace starts using VFIO. When the userspace process finishes, > all the pinned pages need to be put; this is done as a part of > the userspace memory context (MM) destruction which happens on > the very last mmdrop(). > > This approach has a problem that a MM of the userspace process > may live longer than the userspace process itself as kernel threads > usually execute on a MM of a userspace process which was runnning > on a CPU where the kernel thread was scheduled to. If this happened, > the MM remains referenced until this exact kernel thread wakes up again > and releases the very last reference to the MM, on an idle system this > can take even hours. > > This fixes the issue by moving mm_iommu_cleanup() (the helper which puts > pages) from destroy_context() (called on the last mmdrop()) to > the arch-specific arch_exit_mmap() hook (called on the last mmput()). > mmdrop() decrements the mm->mm_count which is a total reference number; > mmput() decrements the mm->mm_users which is a number of user spaces and > this is actually the counter we want to watch for here. > > Cc: David Gibson <da...@gibson.dropbear.id.au> > Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org> > Cc: Paul Mackerras <pau...@samba.org> > Cc: Balbir Singh <bsinghar...@gmail.com> > Cc: Nick Piggin <npig...@kernel.dk> > Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > --- > arch/powerpc/include/asm/mmu_context.h | 3 +++ > arch/powerpc/mm/mmu_context_book3s64.c | 4 ---- > 2 files changed, 3 insertions(+), 4 deletions(-) > > diff --git a/arch/powerpc/include/asm/mmu_context.h > b/arch/powerpc/include/asm/mmu_context.h > index 9d2cd0c..24b590d 100644 > --- a/arch/powerpc/include/asm/mmu_context.h > +++ b/arch/powerpc/include/asm/mmu_context.h > @@ -138,6 +138,9 @@ static inline void arch_dup_mmap(struct mm_struct *oldmm, > > static inline void arch_exit_mmap(struct mm_struct *mm) > { > +#ifdef CONFIG_SPAPR_TCE_IOMMU > + mm_iommu_cleanup(&mm->context); > +#endif > } >
The assumption is that all necessary tear down has happened. Earlier we were doing the teardown on the last mm ref drop, do we have any dependence on that? I don't think so, but just checking Does qemu dotce_iommu_register_pages on exit()/atexit() or do we rely on the exit path to do the teardown? Can we please remove the #ifdef CONFIG_SPAPR_TCE_IOMMU here and have a version of mm_iommu_cleanup that handles it Basically do #ifdef CONFIG_SPAPR_TCE_IOMMU extern void mm_iommu_cleanup(void *ptr) #else static inline mm_iommu_cleanup(void *ptr) {} #endif Otherwise looks good Acked-by: Balbir Singh <bsinghar...@gmail.com> _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev