[AMD Official Use Only - AMD Internal Distribution Only]

Greetings @Kuehling, Felix

Thanks for the RB.  It all makes sense.  BTW I kept the "opposite" change in 
(with the update commit message),  as it was suggested by you earlier when we 
chatted about private/public API.  I really do love all the subtle details that 
goes into each patches.

One love!

> -----Original Message-----
> From: Kuehling, Felix <[email protected]>
> Sent: Friday, December 5, 2025 11:34 AM
> To: [email protected]; Martin, Andrew <[email protected]>
> Subject: Re: [PATCH v4] drm/amdkfd: FORWARD NULL
>
> Please change the commit headline. Suggestion: "Check for NULL return values"
>
> One more comment inline ...
>
> On 2025-12-03 10:41, Andrew Martin wrote:
> > This patch fixes issues when the code moves forward with a potential
> > NULL pointer, without checking.
> >
> > Signed-off-by: Andrew Martin <[email protected]>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c |  2 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_crat.c            |  2 +-
> >   drivers/gpu/drm/amd/amdkfd/kfd_debug.c           | 12 ++++++++++--
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c         |  3 ---
> >   4 files changed, 12 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> > index 193ed8becab8..31b8fa52b42f 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c
> > @@ -107,7 +107,7 @@ static const char
> *amdkfd_fence_get_timeline_name(struct dma_fence *f)
> >   {
> >     struct amdgpu_amdkfd_fence *fence = to_amdgpu_amdkfd_fence(f);
> >
> > -   return fence->timeline_name;
> > +   return fence ? fence->timeline_name : NULL;
> >   }
> >
> >   /**
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > index 5f2dd378936e..d1d72cd332fc 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> > @@ -2358,7 +2358,7 @@ static int kfd_create_vcrat_image_gpu(void
> *pcrat_image,
> >     if (kdev->kfd->hive_id) {
> >             for (nid = 0; nid < proximity_domain; ++nid) {
> >                     peer_dev =
> kfd_topology_device_by_proximity_domain_no_lock(nid);
> > -                   if (!peer_dev->gpu)
> > +                   if (!peer_dev || !peer_dev->gpu)
> >                             continue;
> >                     if (peer_dev->gpu->kfd->hive_id != kdev->kfd->hive_id)
> >                             continue;
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > index ba9a09b6589a..1cb24092b17e 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
> > @@ -516,9 +516,13 @@ int kfd_dbg_trap_set_flags(struct kfd_process *target,
> uint32_t *flags)
> >     int i, r = 0, rewind_count = 0;
> >
> >     for (i = 0; i < target->n_pdds; i++) {
> > +           uint32_t caps;
> >             struct kfd_topology_device *topo_dev =
> > -                           kfd_topology_device_by_id(target->pdds[i]-
> >dev->id);
> > -           uint32_t caps = topo_dev->node_props.capability;
> > +                   kfd_topology_device_by_id(target->pdds[i]->dev->id);
> > +           if (!topo_dev)
> > +                   return -EINVAL;
> > +
> > +           caps = topo_dev->node_props.capability;
> >
> >             if (!(caps &
> HSA_CAP_TRAP_DEBUG_PRECISE_MEMORY_OPERATIONS_SUPPORTED) &&
> >                     (*flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP)) {
> @@ -1071,6 +1075,10
> > @@ int kfd_dbg_trap_device_snapshot(struct kfd_process *target,
> >     for (i = 0; i < tmp_num_devices; i++) {
> >             struct kfd_process_device *pdd = target->pdds[i];
> >             struct kfd_topology_device *topo_dev =
> > kfd_topology_device_by_id(pdd->dev->id);
> > +           if (!topo_dev) {
> > +                   r = -EINVAL;
> > +                   break;
> > +           }
> >
> >             device_info.gpu_id = pdd->dev->id;
> >             device_info.exception_status = pdd->exception_status; diff --git
> > a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > index aec7522407db..47783803f56f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > @@ -1763,9 +1763,6 @@ int kfd_process_device_init_vm(struct
> kfd_process_device *pdd,
> >     struct kfd_node *dev;
> >     int ret;
> >
> > -   if (!drm_file)
> > -           return -EINVAL;
> > -
>
> This is kind of the opposite of the rest of this patch. If you want to keep 
> this in
> here, please call it out in the commit description with something like
>
>      "Removed one redundant NULL check for a function parameter. This check is
> already done in the only caller."
>
> With that fixed, the patch is
>
> Reviewed-by: Felix Kuehling <[email protected]>
>
>
> >     if (pdd->drm_priv)
> >             return -EBUSY;
> >

Reply via email to