On 11/2/2016 5:51 PM, Jike Song wrote: > On 11/02/2016 12:09 PM, Alexey Kardashevskiy wrote: >> On 02/11/16 14:29, Kirti Wankhede wrote: >>> >>> >>> On 11/2/2016 6:54 AM, Alexey Kardashevskiy wrote: >>>> On 02/11/16 01:01, Kirti Wankhede wrote: >>>>> >>>>> >>>>> On 10/28/2016 7:48 AM, Alexey Kardashevskiy wrote: >>>>>> On 27/10/16 23:31, Kirti Wankhede wrote: >>>>>>> >>>>>>> >>>>>>> On 10/27/2016 12:50 PM, Alexey Kardashevskiy wrote: >>>>>>>> On 18/10/16 08:22, Kirti Wankhede wrote: >>>>>>>>> VFIO IOMMU drivers are designed for the devices which are IOMMU >>>>>>>>> capable. >>>>>>>>> Mediated device only uses IOMMU APIs, the underlying hardware can be >>>>>>>>> managed by an IOMMU domain. >>>>>>>>> >>>>>>>>> Aim of this change is: >>>>>>>>> - To use most of the code of TYPE1 IOMMU driver for mediated devices >>>>>>>>> - To support direct assigned device and mediated device in single >>>>>>>>> module >>>>>>>>> >>>>>>>>> Added two new callback functions to struct vfio_iommu_driver_ops. >>>>>>>>> Backend >>>>>>>>> IOMMU module that supports pining and unpinning pages for mdev devices >>>>>>>>> should provide these functions. >>>>>>>>> Added APIs for pining and unpining pages to VFIO module. These calls >>>>>>>>> back >>>>>>>>> into backend iommu module to actually pin and unpin pages. >>>>>>>>> >>>>>>>>> This change adds pin and unpin support for mediated device to TYPE1 >>>>>>>>> IOMMU >>>>>>>>> backend module. More details: >>>>>>>>> - When iommu_group of mediated devices is attached, task structure is >>>>>>>>> cached which is used later to pin pages and page accounting. >>>>>>>> >>>>>>>> >>>>>>>> For SPAPR TCE IOMMU driver, I ended up caching mm_struct with >>>>>>>> atomic_inc(&container->mm->mm_count) (patches are on the way) instead >>>>>>>> of >>>>>>>> using @current or task as the process might be gone while VFIO >>>>>>>> container is >>>>>>>> still alive and @mm might be needed to do proper cleanup; this might >>>>>>>> not be >>>>>>>> an issue with this patchset now but still you seem to only use @mm from >>>>>>>> task_struct. >>>>>>>> >>>>>>> >>>>>>> Consider the example of QEMU process which creates VFIO container, QEMU >>>>>>> in its teardown path would release the container. How could container be >>>>>>> alive when process is gone? >>>>>> >>>>>> do_exit() in kernel/exit.c calls exit_mm() (which sets NULL to tsk->mm) >>>>>> first, and then releases open files by calling exit_files(). So >>>>>> container's release() does not have current->mm. >>>>>> >>>>> >>>>> Incrementing usage count (get_task_struct()) while saving task structure >>>>> and decementing it (put_task_struct()) from release() should work here. >>>>> Updating the patch. >>>> >>>> I cannot see how the task->usage counter prevents do_exit() from performing >>>> the exit, can you? >>>> >>> >>> It will not prevent exit from do_exit(), but that will make sure that we >>> don't have stale pointer of task structure. Then we can check whether >>> the task is alive and get mm pointer in teardown path as below: >> >> >> Or you could just reference and use @mm as KVM and others do. Or there is >> anything else you need from @current than just @mm? >> > > I agree. If @mm is the only thing needed, there is really no reason to > refer to the @task :-) >
In vfio_lock_acct(), that is for page accounting, if mm->mmap_sem is already held then page accounting is deferred, where task structure is used to get mm and work is deferred only if mm exist: mm = get_task_mm(task); That is where this module need task structure. Thanks, Kirti