On 2025. december 17., szerda 23:11:45 középső államokbeli zónaidő Timur
Kristóf wrote:
> On 2025. december 15., hétfő 10:07:11 középső államokbeli zónaidő Alex
> Deucher
> wrote:
> > GFX ring resets work differently on pre-GFX10 hardware since
> > there is no MQD managed by the scheduler.
> > For ring reset, you need to issue the reset via CP_VMID_RESET
> > via KIQ or MMIO and submit the following to the gfx ring to
> > complete the reset:
> > 1. EOP packet with EXEC bit set
> > 2. WAIT_REG_MEM to wait for the fence
> > 3. Clear CP_VMID_RESET to 0
> > 4. EVENT_WRITE ENABLE_LEGACY_PIPELINE
> > 5. EOP packet with EXEC bit set
> > 6. WAIT_REG_MEM to wait for the fence
> > Once those commands have completed the reset should
> > be complete and the ring can accept new packets.
> >
> > However, because we have a pipeline sync between jobs,
> > the PFP is waiting on the fence from the bad job to signal so
> > it can't process any of the packets in the reset sequence
> > until that pipeline sync clears. To unblock the PFP, we
> > use the KIQ to signal the fence after we reset the queue.
> >
> > Signed-off-by: Alex Deucher <[email protected]>
> > ---
> >
> > drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 104 +++++++++++++++++++++++++-
> > 1 file changed, 101 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c index bb1465a98c7ca..9b7073650315e
> > 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > @@ -2410,8 +2410,10 @@ static int gfx_v9_0_sw_init(struct amdgpu_ip_block
> > *ip_block) amdgpu_get_soft_full_reset_mask(&adev->gfx.gfx_ring[0]);
> >
> > adev->gfx.compute_supported_reset =
> >
> > amdgpu_get_soft_full_reset_mask(&adev-
> >
> >gfx.compute_ring[0]);
> >
> > - if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset)
> > + if (!amdgpu_sriov_vf(adev) && !adev->debug_disable_gpu_ring_reset)
>
> {
>
> > adev->gfx.compute_supported_reset |=
>
> AMDGPU_RESET_TYPE_PER_QUEUE;
>
> > + adev->gfx.gfx_supported_reset |=
>
> AMDGPU_RESET_TYPE_PER_QUEUE;
>
> > + }
> >
> > r = amdgpu_gfx_kiq_init(adev, GFX9_MEC_HPD_SIZE, 0);
> > if (r) {
> >
> > @@ -7163,6 +7165,103 @@ static void gfx_v9_ring_insert_nop(struct
> > amdgpu_ring *ring, uint32_t num_nop) amdgpu_ring_insert_nop(ring, num_nop
> > -
> > 1);
> >
> > }
> >
> > +static void gfx_v9_0_ring_emit_wreg_me(struct amdgpu_ring *ring,
> > + uint32_t reg,
> > + uint32_t val)
> > +{
> > + uint32_t cmd = 0;
> > +
> > + switch (ring->funcs->type) {
> > + case AMDGPU_RING_TYPE_KIQ:
> > + cmd = (1 << 16); /* no inc addr */
>
> What do you mean by "inc addr" in this context?
>
> > + break;
> > + default:
> > + cmd = WR_CONFIRM;
> > + break;
> > + }
> > + amdgpu_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
> > + amdgpu_ring_write(ring, cmd);
> > + amdgpu_ring_write(ring, reg);
> > + amdgpu_ring_write(ring, 0);
> > + amdgpu_ring_write(ring, val);
> > +}
> > +
> > +static void gfx_v9_0_ring_emit_event_write(struct amdgpu_ring *ring,
> > + uint32_t
event_type,
> > + uint32_t
>
> event_index)
>
> > +{
> > + amdgpu_ring_write(ring, PACKET3(PACKET3_EVENT_WRITE, 0));
> > + amdgpu_ring_write(ring, EVENT_TYPE(event_type) |
> > + EVENT_INDEX(event_index));
> > +}
> > +
> > +static int gfx_v9_0_reset_kgq(struct amdgpu_ring *ring,
> > + unsigned int vmid,
> > + struct amdgpu_fence *timedout_fence)
> > +{
> > + struct amdgpu_device *adev = ring->adev;
> > + struct amdgpu_kiq *kiq = &adev->gfx.kiq[0];
> > + struct amdgpu_ring *kiq_ring = &kiq->ring;
> > + unsigned long flags;
> > + u32 tmp;
> > + int r;
> > +
> > + amdgpu_ring_reset_helper_begin(ring, timedout_fence);
> > +
> > + spin_lock_irqsave(&kiq->ring_lock, flags);
> > +
> > + if (amdgpu_ring_alloc(kiq_ring, 5 + 5)) {
> > + spin_unlock_irqrestore(&kiq->ring_lock, flags);
> > + return -ENOMEM;
> > + }
> > +
> > + /* send the reset - 5 */
> > + tmp = REG_SET_FIELD(0, CP_VMID_RESET, RESET_REQUEST, 1 << vmid);
> > + gfx_v9_0_ring_emit_wreg(kiq_ring,
> > + SOC15_REG_OFFSET(GC, 0,
>
> mmCP_VMID_RESET), tmp);
>
> > + /* emit the fence to clear the pipeline sync - 5 */
> > + gfx_v9_0_ring_emit_fence_kiq(kiq_ring, ring->fence_drv.gpu_addr,
> > + timedout_fence->base.seqno,
>
> 0);
>
> As far as I see, this isn't going to work when sched_hw_submission > 2 and
> there are more than two jobs (from various different userspace processes)
> emitted in the ring.
>
> I can think of two possible solutions:
> - Emit each fence individually, with a short delay in between to give a
> chance to the GFX ring to catch up with the KIQ.
> - Change the wait_reg_mem command used for the pipeline sync to allow a
> greater-than-or-equal comparison instead of just equal. Then it's enough to signal just
> the last fence on the KIQ ring.
On a second thought, both of my suggestions are wrong because they would cause
the fences of non-guilty jobs to be emitted before the jobs are actually
complete.
It occurs to me that we could change the gfx pipeline sync to not rely on the
same fences that signal job completion. Instead we could emit a different fence
for just the pipeline sync. Then it is no longer a problem to signal those
during recovery.
>
> > + amdgpu_ring_commit(kiq_ring);
> > + r = amdgpu_ring_test_ring(kiq_ring);
> > + spin_unlock_irqrestore(&kiq->ring_lock, flags);
> > + if (r)
> > + return r;
> > +
> > + if (amdgpu_ring_alloc(ring, 8 + 7 + 5 + 2 + 8 + 7))
> > + return -ENOMEM;
> > + /* emit the fence to finish the reset - 8 */
> > + ring->trail_seq++;
> > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
> > + ring->trail_seq,
>
> AMDGPU_FENCE_FLAG_EXEC);
>
> > + /* wait for the fence - 7 */
> > + gfx_v9_0_wait_reg_mem(ring, 0, 1, 0,
> > + lower_32_bits(ring-
> >
> >trail_fence_gpu_addr),
> >
> > + upper_32_bits(ring-
> >
> >trail_fence_gpu_addr),
> >
> > + ring->trail_seq, 0xffffffff, 4);
> > + /* clear mmCP_VMID_RESET - 5 */
> > + gfx_v9_0_ring_emit_wreg_me(ring,
> > + SOC15_REG_OFFSET(GC, 0,
>
> mmCP_VMID_RESET), 0);
>
> > + /* event write ENABLE_LEGACY_PIPELINE - 2 */
> > + gfx_v9_0_ring_emit_event_write(ring, ENABLE_LEGACY_PIPELINE, 0);
> > + /* emit a regular fence - 8 */
> > + ring->trail_seq++;
> > + gfx_v9_0_ring_emit_fence(ring, ring->trail_fence_gpu_addr,
> > + ring->trail_seq,
>
> AMDGPU_FENCE_FLAG_EXEC);
>
> > + /* wait for the fence - 7 */
> > + gfx_v9_0_wait_reg_mem(ring, 1, 1, 0,
> > + lower_32_bits(ring-
> >
> >trail_fence_gpu_addr),
> >
> > + upper_32_bits(ring-
> >
> >trail_fence_gpu_addr),
> >
> > + ring->trail_seq, 0xffffffff, 4);
>
> Why is it necessary to emit (and wait for) a regular fence here?
> I'm not against it, just curious why it's needed.
>
> > + amdgpu_ring_commit(ring);
> > + /* wait for the commands to complete */
> > + r = amdgpu_ring_test_ring(ring);
> > + if (r)
> > + return r;
> > +
> > + return amdgpu_ring_reset_helper_end(ring, timedout_fence);
> > +}
> > +
> >
> > static int gfx_v9_0_reset_kcq(struct amdgpu_ring *ring,
> >
> > unsigned int vmid,
> > struct amdgpu_fence *timedout_fence)
> >
> > @@ -7441,9 +7540,9 @@ static const struct amdgpu_ring_funcs
> > gfx_v9_0_ring_funcs_gfx = { .emit_wreg = gfx_v9_0_ring_emit_wreg,
> >
> > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
> > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
> >
> > - .soft_recovery = gfx_v9_0_ring_soft_recovery,
>
> Can you please split removing the soft recovery into a separate patch?
>
> Can we talk about removing the soft recovery? For the other chips where it
> has already been removed, it is perceived by users as a regression.
>
> > .emit_mem_sync = gfx_v9_0_emit_mem_sync,
> > .emit_cleaner_shader = gfx_v9_0_ring_emit_cleaner_shader,
> >
> > + .reset = gfx_v9_0_reset_kgq,
> >
> > .begin_use = amdgpu_gfx_enforce_isolation_ring_begin_use,
> > .end_use = amdgpu_gfx_enforce_isolation_ring_end_use,
> >
> > };
> >
> > @@ -7542,7 +7641,6 @@ static const struct amdgpu_ring_funcs
> > gfx_v9_0_ring_funcs_compute = { .emit_wreg = gfx_v9_0_ring_emit_wreg,
> >
> > .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
> > .emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
> >
> > - .soft_recovery = gfx_v9_0_ring_soft_recovery,
> >
> > .emit_mem_sync = gfx_v9_0_emit_mem_sync,
> > .emit_wave_limit = gfx_v9_0_emit_wave_limit,
> > .reset = gfx_v9_0_reset_kcq,