On 4/23/25 11:50, Shane Xiao wrote:
> If applications unmap the memory before destroying the userptr, it needs
> trigger a segfault to notify user space to correct the free sequence in
> VM debug mode.
>
> Signed-off-by: Shane Xiao <shane.x...@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index d2ec4130a316..259b38424b7f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -2559,6 +2559,16 @@ static int update_invalid_user_pages(struct amdkfd_process_info *process_info,
> if (ret != -EFAULT)
> return ret;
>
> + /* If applications unmaps memory before destroying the userptr
> + * from the KFD, trigger a segmentation fault in VM debug mode.
> + */
> + if (amdgpu_ttm_adev(bo->tbo.bdev)->debug_vm) {

Using debug_vm works for now, but maybe we should have a separate debug flag
for this.

> + amdgpu_ttm_tt_get_userptr(&bo->tbo, userptr);
> + pr_err("User space unmap memory before destroying a userptr that refers to it\n");
> + pr_err("The unmap userptr address is 0x%llx\n", userptr);
> + send_sig(SIGSEGV, get_pid_task(process_info->pid, PIDTYPE_PID), 0);

Drivers should *never* mess with send_sig() directly. We already made the
mistake of allowing that in the KFD.

We should rather send this as a GPU access fault or something like that.

Regards,
Christian.

> + }
> +
> ret = 0;
> }
>