[Public] The big thing is that HMM doesn’t support a lack of read permissions. So does it make sense to make any HMM calls or set any HMM flags if it’s an unsupported configuration that it will fault on? That was why I went down this path, since trying to remove the READ permissions here while still using HMM calls could go sideways, if HMM will automatically fail if READ permissions are missing.
Kent From: Chen, Xiaogang <xiaogang.c...@amd.com> Sent: Tuesday, July 22, 2025 6:46 PM To: Russell, Kent <kent.russ...@amd.com>; amd-gfx@lists.freedesktop.org Cc: Kuehling, Felix <felix.kuehl...@amd.com> Subject: Re: [PATCH] drm/amdkfd: Handle lack of READ permissions in SVM mapping On 7/22/2025 11:24 AM, Kent Russell wrote: HMM assumes that pages have READ permissions by default. Inside svm_range_validate_and_map, we add READ permissions then add WRITE permissions if the VMA isn't read-only. This will conflict with regions that only have PROT_WRITE or have PROT_NONE. When that happens, Why read-only conflict with PROT_WRITE or have PROT_NONE? They are vma->vm_flags that specifies the vma protection. User can change its value at runtime. Is user not allowed to change it from read-only to PROT_NONE? svm_range_validate_and_map will continue to retry, silently, giving the impression of a hang. If pages don't have READ permissions, simply unmap them and continue. If they weren't mapped in the first place, this would be a no-op. Since x86 doesn't support write-only, and PROT_NONE doesn't allow reads or writes anyways, this will allow the svm range validation to continue without getting stuck in a loop forever on mappings we can't use with HMM. Signed-off-by: Kent Russell <kent.russ...@amd.com><mailto:kent.russ...@amd.com> --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index e23b5a0f31f2..10b70b941b11 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1713,6 +1713,24 @@ static int svm_range_validate_and_map(struct mm_struct *mm, next = min(vma->vm_end, end); npages = (next - addr) >> PAGE_SHIFT; + /* HMM requires at least READ permissions. If provided with PROT_NONE, + * unmap the memory. If it's not already mapped, this is a no-op + * If PROT_WRITE is provided without READ, warn first then unmap + */ + if (!(vma->vm_flags & VM_READ)) { + unsigned long e, s; + + if (vma->vm_flags & VM_WRITE) + pr_warn("VM_WRITE without VM_READ is not supported"); + s = max(start, prange->start); + e = min(end, prange->last); + if (e >= s) + svm_range_unmap_from_gpus(prange, s, e, + KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU); + addr = next; + continue; + } + WRITE_ONCE(p->svms.faulting_task, current); r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages, readonly, owner, NULL, It seems the real problem is at amdgpu_hmm_range_get_pages. It always set HMM_PFN_REQ_FAULT to hmm_range->default_flags. HMM_PFN_REQ_FAULT means the page is faultable and a future call with HMM_PFN_REQ_FAULT could succeed. When vma->vm_flags is PROT_NONE the vma is not faultable, so hmm_range->default_flags should be not set to HMM_PFN_REQ_FAULT to avoid hmm_range_fault fault this vma. Regards Xiaogang