[Public]

Regards,
      Prike

> -----Original Message-----
> From: Koenig, Christian <christian.koe...@amd.com>
> Sent: Friday, July 11, 2025 8:13 PM
> To: Liang, Prike <prike.li...@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <alexander.deuc...@amd.com>
> Subject: Re: [PATCH v6 07/11] drm/amdgpu: validate userq's last fence prior to
> destroying
>
> On 11.07.25 11:39, Prike Liang wrote:
> > The userq requires validating queue status before destroying it, if
> > user tries to destroy a busy userq by IOCTL then the driver should
> > report an error for this illegal usage.
>
> Clear NAK, destroying a busy userqueue is perfectly valid!
Yes, the firmware should handle such case something like as preempting the 
queue.
If we directly unmap a hang queue and may further cause the MES firmware hang 
up,
so, do we need to detect the hang userq here by checking the userq fence status 
and reset the hang queue before further performs the unmap queue?

> Regards,
> Christian.
>
> >
> > Signed-off-by: Prike Liang <prike.li...@amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > index 81fbb00b6d91..bcbe8d3f66ed 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c
> > @@ -281,7 +281,7 @@ amdgpu_userq_map_helper(struct amdgpu_userq_mgr
> *uq_mgr,
> >     return r;
> >  }
> >
> > -static void
> > +static int
> >  amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
> >                              struct amdgpu_usermode_queue *queue)  { @@ -
> 290,10 +290,14 @@
> > amdgpu_userq_wait_for_last_fence(struct amdgpu_userq_mgr *uq_mgr,
> >
> >     if (f && !dma_fence_is_signaled(f)) {
> >             ret = dma_fence_wait_timeout(f, true, msecs_to_jiffies(100));
> > -           if (ret <= 0)
> > +           if (ret <= 0) {
> >                     drm_file_err(uq_mgr->file, "Timed out waiting for
> fence=%llu:%llu\n",
> >                                  f->context, f->seqno);
> > +                   return -ETIMEDOUT;
> > +           }
> >     }
> > +
> > +   return 0;
> >  }
> >
> >  static void
> > @@ -509,7 +513,12 @@ amdgpu_userq_destroy(struct drm_file *filp, int
> queue_id)
> >             mutex_unlock(&uq_mgr->userq_mutex);
> >             return -EINVAL;
> >     }
> > -   amdgpu_userq_wait_for_last_fence(uq_mgr, queue);
> > +
> > +   if (amdgpu_userq_wait_for_last_fence(uq_mgr, queue)) {
> > +           drm_warn(adev_to_drm(uq_mgr->adev), "Don't destroy a busy
> userq\n");
> > +           /* For the fence signal timeout case, it requires resetting the 
> > busy
> queue.*/
> > +           r = -ETIMEDOUT;
> > +   }
> >
> >     /*
> >      * At this point the userq obj va should be mapped,

Reply via email to