On 3/26/25 10:21, Bert Karwatzki wrote:
> On Wednesday, 26.03.2025 at 09:45 +1100, Balbir Singh wrote:
>>
>>
>> The second region seems to be additional, I suspect that is HMM mapping from
>> kgd2kfd_init_zone_device()
>>
>> Balbir Singh
>>
> Good guess! I inserted a printk into kgd2kfd_init_zone_device():
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index d05d199b5e44..201220e2ac42 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -1049,6 +1049,8 @@ int kgd2kfd_init_zone_device(struct amdgpu_device *adev)
>  		pgmap->range.end = res->end;
>  		pgmap->type = MEMORY_DEVICE_PRIVATE;
>  	}
> +	dev_info(adev->dev, "%s: range.start = 0x%llx ranges.end = 0x%llx\n",
> +			__func__, pgmap->range.start, pgmap->range.end);
>
>  	pgmap->nr_range = 1;
>  	pgmap->ops = &svm_migrate_pgmap_ops;
>
>
> and get this in the case without nokaslr:
>
> [ T367] amdgpu 0000:03:00.0: kfd_migrate: kgd2kfd_init_zone_device:
> range.start = 0xafe00000000 ranges.end = 0xaffffffffff
>
> and this in the case with nokaslr:
>
> [ T365] amdgpu 0000:03:00.0: kfd_migrate: kgd2kfd_init_zone_device:
> range.start = 0x3ffe00000000 ranges.end = 0x3fffffffffff
>

So we should ignore the second region for the purposes of this issue.
I think this now boils down to:

Why does dma_get_required_mask() return a mask covering all of addressable
memory (46 bits) when we boot with nokaslr?

Can you add some debug around that?

Thanks,
Balbir
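
For reference, the bit widths of the two range ends printed above can be checked with a small userspace sketch. It assumes the required mask is obtained by rounding the highest address up to an all-ones mask; the helper names fls64_sketch() and required_mask_sketch() are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Position of the highest set bit, counted from 1; returns 0 for x == 0.
 * Hypothetical userspace stand-in, not the kernel helper itself.
 */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/*
 * Smallest all-ones mask that covers max_addr.
 * Assumption: the required mask is derived this way from the highest
 * address that has to be reachable.
 */
static uint64_t required_mask_sketch(uint64_t max_addr)
{
	unsigned int bits = fls64_sketch(max_addr);

	assert(bits > 0);	/* a zero address has no meaningful mask */
	return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}
```

With the values from the log, fls64_sketch(0x3fffffffffff) gives 46 (the nokaslr case) and fls64_sketch(0xaffffffffff) gives 44 (the case without nokaslr), so under this assumption a range ending at 0x3fffffffffff alone would already call for a 46-bit mask.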