On 2025-11-03 18:25, Chen, Xiaogang wrote:
On 11/3/2025 4:21 PM, Harish Kasiviswanathan wrote:
Fix the following corner case:-
Consider a 2M huge page SVM allocation, followed by prefetch call for
the first 4K page. The whole range is initially mapped with single PTE.
After the prefetch, this range gets split to first page + rest of the
pages. Currently, the first page mapping is not updated on MI300A (APU)
since page hasn't migrated. However, after range split PTE mapping it not
valid.
Fix this by forcing page table update for the whole range when prefetch
is called. Calling prefetch on APU doesn't improve performance. If all
it deteriotes. However, functionality has to be supported.
v2: Use apu_prefer_gtt as this issue doesn't apply to APUs with carveout
VRAM
v3: Simplify by setting the flag for all ASICs as it doesn't affect dGPU
Suggested-by: Philip Yang<[email protected]>
Signed-off-by: Harish Kasiviswanathan<[email protected]>
Reviewed-by: Philip Yang<[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index c30dfb8ec236..26eac89c90a8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -768,6 +768,9 @@ svm_range_apply_attrs(struct kfd_process *p, struct
svm_range *prange,
int gpuidx;
for (i = 0; i < nattr; i++) {
+ if (!p->xnack_enabled)
+ *update_mapping = true;
If you want always set update_mapping, set it outside loop, not need
set it during each attribution check.
And I think the discussion is setting update_mapping when there is
split for prange. If change existing prange attributions without
split, not need update vm for
KFD_IOCTL_SVM_ATTR_PREFERRED_LOC/KFD_IOCTL_SVM_ATTR_PREFETCH_LOC.
Maybe change like this
@@ -3800,6 +3800,9 @@ svm_range_set_attr(struct kfd_process *p, struct
mm_struct *mm,
svm_range_apply_attrs(p, prange, nattr, attrs,
&update_mapping);
/* TODO: unmap ranges from GPU that lost access */
}
+
+ update_mapping |= !p->xnack_enabled && !list_empty(&remap_list);
+
list_for_each_entry_safe(prange, next, &remove_list, update_list) {
pr_debug("unlink old 0x%p prange 0x%p [0x%lx 0x%lx]\n",
prange->svms, prange, prange->start,
Regards,
Philip
Regards
Xiaogang
+
switch (attrs[i].type) {
case KFD_IOCTL_SVM_ATTR_PREFERRED_LOC:
prange->preferred_loc = attrs[i].value;
@@ -778,8 +781,6 @@ svm_range_apply_attrs(struct kfd_process *p, struct
svm_range *prange,
case KFD_IOCTL_SVM_ATTR_ACCESS:
case KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE:
case KFD_IOCTL_SVM_ATTR_NO_ACCESS:
- if (!p->xnack_enabled)
- *update_mapping = true;
gpuidx = kfd_process_gpuidx_from_gpuid(p,
attrs[i].value);