On 2016-11-30 11:23 AM, Jason Gunthorpe wrote:
>> Yes, that sounds fine. Can we simply kill the process from the GPU driver?
>> Or do we need to extend the OOM killer to manage GPU pages?
> I don't know..

We could use send_sig_info() to send a signal from the kernel to user space, so in theory the GPU driver could issue a KILL signal to a process.
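
To show what the target process would see, here is a rough user-space analogue. It is not kernel code and does not call send_sig_info(); it uses the POSIX kill() call on a forked child instead. The helper name kill_child_and_report() is made up for illustration. The point is only that SIGKILL cannot be caught or ignored, so the victim simply terminates and its parent sees the signal number in the wait status.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * User-space analogue only (hypothetical helper): fork a child, send it
 * SIGKILL, and report which signal terminated it.
 * Returns the terminating signal number, or -1 on any failure.
 */
static int kill_child_and_report(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: stands in for the process the driver wants to stop. */
        for (;;)
            pause();
    }

    /* Parent: stands in for the driver issuing the KILL signal. */
    if (kill(pid, SIGKILL) != 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Inside a driver the equivalent step would be a send_sig_info() call on the target task; which task to pick, and when, is the policy question raised above and is not addressed by this sketch.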

> On Wed, Nov 30, 2016 at 12:45:58PM +0200, Haggai Eran wrote:
>> I think we can achieve the kernel's needs with ZONE_DEVICE and DMA-API support
>> for peer to peer. I'm not sure we need vmap. We need a way to have a scatterlist
>> of MMIO pfns, and ZONE_DEVICE allows that.

I do not think that using the DMA-API as it is is the best solution (at least in its current form):

- It deals with handles/fds for the whole allocation, but a client could (and will) use sub-allocations, and it is also theoretically possible to "merge" several allocations into one from the GPU's perspective.
- It requires knowing in advance what to export, but because "sharing" is controlled from user space, we would have to "export" every allocation by default.
- It deals with fds/handles, but a user application may work with addresses/pointers.

Also, the current DMA-API forces us to do all of the DMA table programming every time, regardless of whether the location has changed or not. With vma / mmu we are able to install a notifier to intercept changes in location and update the translation tables only as needed (we do not need to keep the get_user_pages() lock).
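
To make the notifier argument concrete, here is a minimal user-space model of the idea. It is only a sketch and does not use the kernel mmu_notifier API: struct cpu_mapping, struct dev_table, and all of the function names are hypothetical stand-ins. The "CPU side" fires a callback before a page changes location, the "device side" marks the affected entries stale, and the next DMA setup reprograms only those stale entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 8

/* Hypothetical stand-in for a CPU mapping: page index -> page frame number. */
struct cpu_mapping {
    uint64_t pfn[NPAGES];
    /* Notifier fired before pages [first, last] change location. */
    void (*invalidate)(void *ctx, size_t first, size_t last);
    void *ctx;
};

/* Hypothetical device translation table mirroring the CPU mapping. */
struct dev_table {
    uint64_t pfn[NPAGES];
    int valid[NPAGES];
    unsigned long programmed;   /* total entries written so far */
};

/* Notifier callback: mark the affected device entries stale. */
static void dev_invalidate(void *ctx, size_t first, size_t last)
{
    struct dev_table *t = ctx;

    for (size_t i = first; i <= last && i < NPAGES; i++)
        t->valid[i] = 0;
}

/* The "kernel" moves a page and notifies the listener first. */
static void cpu_move_page(struct cpu_mapping *m, size_t page, uint64_t new_pfn)
{
    if (m->invalidate)
        m->invalidate(m->ctx, page, page);
    m->pfn[page] = new_pfn;
}

/*
 * Prepare pages [first, last] for DMA, rewriting only stale entries.
 * Returns how many entries had to be reprogrammed.
 */
static unsigned long dev_prepare(struct dev_table *t,
                                 const struct cpu_mapping *m,
                                 size_t first, size_t last)
{
    unsigned long n = 0;

    for (size_t i = first; i <= last && i < NPAGES; i++) {
        if (!t->valid[i]) {
            t->pfn[i] = m->pfn[i];
            t->valid[i] = 1;
            n++;
        }
    }
    t->programmed += n;
    return n;
}
```

In this model the first dev_prepare() over the whole range programs every entry, a second call programs none, and after cpu_move_page() relocates one page only that one entry is rewritten. A scheme that rebuilds the table on every map call would instead pay the full cost each time.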