Mark fences with errors before we reset the rings, as we may end up
signalling fences as part of the reset sequence. The error needs to be
set before the fence is signalled. On GC10 and newer, this isn't a
problem since we don't signal any fences. On GC9, we need to signal the
fence after the reset to unblock the recovery sequence.

Signed-off-by: Alex Deucher <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 600e6bb98af7a..5defdebd7091e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -872,6 +872,10 @@ void amdgpu_ring_reset_helper_begin(struct amdgpu_ring *ring,
 	drm_sched_wqueue_stop(&ring->sched);
 	/* back up the non-guilty commands */
 	amdgpu_ring_backup_unprocessed_commands(ring, guilty_fence);
+	/* signal the guilty fence and set an error on all fences from the context */
+	if (guilty_fence)
+		amdgpu_fence_driver_guilty_force_completion(guilty_fence);
+
 }
 
 int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring,
@@ -885,9 +889,6 @@ int amdgpu_ring_reset_helper_end(struct amdgpu_ring *ring,
 	if (r)
 		return r;
 
-	/* signal the guilty fence and set an error on all fences from the context */
-	if (guilty_fence)
-		amdgpu_fence_driver_guilty_force_completion(guilty_fence);
 	/* Re-emit the non-guilty commands */
 	if (ring->ring_backup_entries_to_copy) {
 		amdgpu_ring_alloc_reemit(ring, ring->ring_backup_entries_to_copy);
-- 
2.52.0
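
For illustration only, the ordering requirement in the commit message
(the error must be set before the fence is signalled) can be sketched
with a small user-space model. This is not kernel code: struct toy_fence
and every toy_* function are hypothetical stand-ins, the reset that
signals the fence as a side effect is an assumption of the model, and
-ECANCELED is an arbitrary error value rather than the one the driver
uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a fence; NOT the kernel's struct dma_fence. */
struct toy_fence {
	bool signaled;
	int error;		/* 0, or a negative errno */
	int seen_by_waiter;	/* status a waiter observes at signal time */
};

/* Record an error. In this model the error is refused once the fence
 * has signalled, because waiters have already read the status.
 */
static bool toy_fence_set_error(struct toy_fence *f, int error)
{
	assert(error < 0);
	if (f->signaled)
		return false;
	f->error = error;
	return true;
}

/* Signal the fence; the waiter latches whatever error is set right now. */
static void toy_fence_signal(struct toy_fence *f)
{
	if (f->signaled)
		return;
	f->signaled = true;
	f->seen_by_waiter = f->error;
}

/* Hypothetical ring reset that signals the fence as a side effect. */
static void toy_ring_reset(struct toy_fence *f)
{
	toy_fence_signal(f);
}

/* Run one recovery and return the status the waiter observed.
 * mark_before_reset == true models the ordering after this patch,
 * false models the ordering before it.
 */
int toy_recover(bool mark_before_reset)
{
	struct toy_fence f = { 0 };

	if (mark_before_reset)
		toy_fence_set_error(&f, -ECANCELED);

	toy_ring_reset(&f);

	if (!mark_before_reset)
		toy_fence_set_error(&f, -ECANCELED);	/* too late, refused */

	return f.seen_by_waiter;
}
```

In this model, toy_recover(true) returns -ECANCELED while
toy_recover(false) returns 0: when the error is set only after the
reset has signalled the fence, the waiter sees a successful completion.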
