[Public]

> -----Original Message-----
> From: amd-gfx <[email protected]> On Behalf Of Sunday
> Clement
> Sent: Friday, October 17, 2025 10:33 AM
> To: [email protected]
> Cc: Kasiviswanathan, Harish <[email protected]>; Kuehling,
> Felix <[email protected]>; Clement, Sunday <[email protected]>
> Subject: [PATCH] drm/amdkfd: Fix nullpointer dereference
>
> In the event no device is found with the given proximity domain and
> kfd_topology_device_by_proximity_domain_no_lock() returns a null device
> immediately checking !peer_Dev->gpu will result in a null pointer
> dereference.
>
> Signed-off-by: Sunday Clement <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> index 4a7180b46b71..6093d96c5892 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> @@ -2357,7 +2357,7 @@ static int kfd_create_vcrat_image_gpu(void
> *pcrat_image,
>  		if (kdev->kfd->hive_id) {
>  			for (nid = 0; nid < proximity_domain; ++nid) {
>  				peer_dev =
> kfd_topology_device_by_proximity_domain_no_lock(nid);
> -				if (!peer_dev->gpu)
> +				if (!peer_dev || !peer_dev->gpu)

Is this a real failure? If so, we should figure out why our assumption that
proximity domain IDs work as a counter for valid devices does not actually
hold. Either way, it is probably better to return an error (something like
-ENODEV) rather than continue, since the IO link data has now been assigned
garbage and we probably don't want to keep building the hive at this point.

Jon

>  					continue;
>  				if (peer_dev->gpu->kfd->hive_id != kdev->kfd->hive_id)
>  					continue;
> --
> 2.43.0
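
To make the -ENODEV suggestion above concrete, here is a rough standalone sketch of the control flow I have in mind. It is not a patch against kfd_crat.c: the mock_* types and helpers are hypothetical stand-ins for the real topology structures and lookup, and the peer counter only stands in for the IO link setup. The point is only that a failed lookup ends the loop with an error, while a node without a GPU is still skipped.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real amdkfd structures. */
struct mock_gpu { unsigned long hive_id; };
struct mock_topology_dev { struct mock_gpu *gpu; };

/* Hypothetical lookup: NULL means no device has this proximity domain. */
static struct mock_topology_dev *
mock_lookup(struct mock_topology_dev **table, size_t n, size_t nid)
{
	return nid < n ? table[nid] : NULL;
}

/*
 * Walk proximity domains [0, proximity_domain) and count peers that
 * share hive_id. Returns the peer count, or -ENODEV if any lookup fails.
 */
static int mock_count_hive_peers(struct mock_topology_dev **table, size_t n,
				 size_t proximity_domain,
				 unsigned long hive_id)
{
	int peers = 0;
	size_t nid;

	for (nid = 0; nid < proximity_domain; ++nid) {
		struct mock_topology_dev *peer_dev = mock_lookup(table, n, nid);

		if (!peer_dev)
			return -ENODEV;	/* lookup failed: stop, do not skip */
		if (!peer_dev->gpu)
			continue;	/* node has no GPU: nothing to link */
		if (peer_dev->gpu->hive_id != hive_id)
			continue;	/* GPU belongs to a different hive */
		peers++;
	}

	return peers;
}
```

Whether the caller should then unwind the partially built image or just propagate the error is a separate question that this sketch does not try to answer.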
