Many thanks for your detailed reply, it really helps a lot!
Regards,
Honglei
On 2026/2/2 17:49, Thomas Hellström wrote:
On Mon, 2026-02-02 at 15:56 +0800, Honglei Huang wrote:
Hi Matthew and Thomas,
I'm exploring the use of drm_gpusvm for multi-GPU shared virtual memory
scenarios and have some questions about potential synchronization issues.
The drm_gpusvm design is per-device oriented, so for multi-GPU setups,
each GPU would have its own drm_gpusvm instance with independent MMU
notifiers registered to the same mm_struct.
When multiple drm_gpusvm instances share the same process address space,
I'm concerned about the following synchronization issues:
1. MMU notifier ordering: When the CPU modifies memory (e.g., munmap),
   multiple notifier callbacks are triggered independently. Is there any
   guarantee on the ordering or atomicity across GPUs? Could this lead
   to inconsistent states between GPUs?
The guarantee is that the invalidation may not proceed until all mmu
notifiers have completed, and drm_gpusvm_range_get_pages() will never
complete successfully until the invalidation is complete.
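To make this concrete, here is a minimal sketch of a per-device fault
handler built on that guarantee. drm_gpusvm_range_find_or_insert(),
drm_gpusvm_range_get_pages() and drm_gpusvm_notifier_lock()/unlock() are
the drm_gpusvm entry points, but their exact signatures may differ
between kernel versions; struct my_vm, my_range_valid() and
my_bind_range() are hypothetical driver-side helpers.

#include <linux/err.h>
#include <drm/drm_gpusvm.h>

struct my_vm {				/* hypothetical driver VM wrapper */
	struct drm_gpusvm gpusvm;	/* one instance per (device, vm) */
	unsigned long start, end;	/* VA window this vm may fault in */
};

/* Hypothetical helpers: validity check (e.g. a seqno) and PTE writing. */
bool my_range_valid(struct my_vm *vm, struct drm_gpusvm_range *range);
int my_bind_range(struct my_vm *vm, struct drm_gpusvm_range *range);

static int my_handle_gpu_fault(struct my_vm *vm, unsigned long fault_addr)
{
	struct drm_gpusvm_ctx ctx = {};
	struct drm_gpusvm_range *range;
	int err;

retry:
	range = drm_gpusvm_range_find_or_insert(&vm->gpusvm, fault_addr,
						vm->start, vm->end, &ctx);
	if (IS_ERR(range))
		return PTR_ERR(range);

	/*
	 * Will not complete successfully while a CPU-side invalidation
	 * is in flight. A real driver separates fatal errors from
	 * transient races; unconditional retry is for brevity only.
	 */
	err = drm_gpusvm_range_get_pages(&vm->gpusvm, range, &ctx);
	if (err)
		goto retry;

	/*
	 * Bind under the notifier lock: if an invalidation ran between
	 * get_pages() and this point, the range is stale and must be
	 * retried rather than exposing stale PTEs to the GPU.
	 */
	drm_gpusvm_notifier_lock(&vm->gpusvm);
	if (!my_range_valid(vm, range)) {
		drm_gpusvm_notifier_unlock(&vm->gpusvm);
		goto retry;
	}
	err = my_bind_range(vm, range);		/* write GPU page tables */
	drm_gpusvm_notifier_unlock(&vm->gpusvm);

	return err;
}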
2. Range state consistency: If GPU-A and GPU-B both have ranges
   covering the same virtual address, and an invalidation occurs, how
   should we ensure both GPUs see a consistent view before allowing
   new GPU accesses?
Multiple GPUs may maintain ranges of different sizes with different
attributes pointing to the same memory, and that's really not a
problem. It's up to user-space to ensure that we're not bouncing data
around between GPUs. In xe, we're using the gpu_madvise() ioctl to
allow the UMD to specify things like preferred region and access mode.
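To illustrate the user-space side of that contract, here is a purely
hypothetical UMD sketch. svm_set_preferred_location() and
svm_set_access_mode() are illustrative placeholders for a
gpu_madvise()-style per-device ioctl, not the real xe uapi.

#include <stddef.h>

enum svm_access_mode { SVM_ACCESS_RW, SVM_ACCESS_RO };

/* Placeholder wrappers: each would issue a madvise-style ioctl on the
 * given device fd for [addr, addr + size). Stubbed out here. */
static int svm_set_preferred_location(int drm_fd, void *addr, size_t size,
				      int preferred_instance)
{
	(void)drm_fd; (void)addr; (void)size; (void)preferred_instance;
	return 0;
}

static int svm_set_access_mode(int drm_fd, void *addr, size_t size,
			       enum svm_access_mode mode)
{
	(void)drm_fd; (void)addr; (void)size; (void)mode;
	return 0;
}

/* Non-conflicting hints for a buffer shared by two GPUs: both devices
 * agree on where the pages should live, so alternating faults do not
 * bounce the data back and forth between them. */
static void hint_shared_buffer(int fd_gpu_a, int fd_gpu_b,
			       void *buf, size_t size)
{
	svm_set_preferred_location(fd_gpu_a, buf, size, 0 /* GPU-A */);
	svm_set_access_mode(fd_gpu_a, buf, size, SVM_ACCESS_RW);
	svm_set_access_mode(fd_gpu_b, buf, size, SVM_ACCESS_RO);
}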
3. Concurrent fault handling: If GPU-A and GPU-B fault on the same
address simultaneously, is there potential for races in
drm_gpusvm_range_find_or_insert()?
Each drm_gpusvm instance is meant to be per-vm and per-device, so each
vm on each GPU only sees its own ranges. The drm_pagemap code then
maintains the migration state, and that is per-cpu-vm, so if it is fed
conflicting migration requests from different GPUs or even different
vms, it will try its best to mitigate them.
However, the invalidation scheme in 1. will always guarantee that all
GPUs either have invalid page-tables, causing GPU faults, or point to
common memory that holds the data.
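For the structural side of "per-vm and per-device", here is a minimal
sketch of device VMs each attaching their own drm_gpusvm instance to the
same CPU mm. drm_gpusvm_init() and the .invalidate hook are the real
interface, but the parameter list and ops layout shown follow one
version of <drm/drm_gpusvm.h> and may differ in your tree; the my_*
structures and the chunk/notifier sizes are hypothetical.

#include <linux/sizes.h>
#include <drm/drm_gpusvm.h>

struct my_vm {
	struct drm_device *drm;
	struct drm_gpusvm gpusvm;	/* private to this (device, vm) pair */
};

/* Runs for every attached device when the CPU mm changes (e.g. munmap):
 * zap this device's PTEs for the range so the next access faults. */
static void my_invalidate(struct drm_gpusvm *gpusvm,
			  struct drm_gpusvm_notifier *notifier,
			  const struct mmu_notifier_range *mmu_range)
{
}

static const struct drm_gpusvm_ops my_ops = {
	.invalidate = my_invalidate,
};

/* Illustrative migration chunk sizes only; real drivers tune these. */
static const unsigned long my_chunk_sizes[] = { SZ_2M, SZ_64K, SZ_4K };

/* Called once per device VM: each GPU registers its own notifier tree on
 * the same mm_struct, so a single CPU-side invalidation reaches all of
 * them independently, which is the guarantee described under 1. above. */
static int my_vm_enable_svm(struct my_vm *vm)
{
	return drm_gpusvm_init(&vm->gpusvm, "my-svm", vm->drm, current->mm,
			       NULL,		/* device-private page owner */
			       0, TASK_SIZE,	/* track the whole CPU VA */
			       SZ_512M,		/* notifier granularity */
			       &my_ops, my_chunk_sizes,
			       ARRAY_SIZE(my_chunk_sizes));
}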
Is multi-GPU a considered use case for drm_gpusvm? If so, are there
recommended patterns for handling these coordination issues?
For us it's considered a valid use-case. In the end I guess that
depends on the API you are exposing to the end-user. The KMD ensures
all GPUs always point to the correct data for a given CPU virtual
address space, but leaves it to user-space to supply non-conflicting
migration requests to avoid excessive migration. I'm under the
impression that our L0 user-facing API also forwards this
responsibility to the end-user.
Hope this information helps.
Thanks,
Thomas
Regards,
Honglei