This set improves per-queue reset support for GC10+. It uses vmid resets for GFX. A GFX vmid reset clears all state associated with that vmid, and execution then continues where it left off. Since only the IB uses the vmid, only the IB is reset and execution continues after the IB. Tested on GC 10 and 11 chips by running hang tests while a game was running. The game pauses when the hang happens, then continues after the queue reset.
I tried this same approach on GC8 and 9, but it was not as reliable as soft recovery. I also compared this to Christian's reset patches, but I was not able to make them work as reliably as this series.

Alex Deucher (9):
  Revert "drm/amd/amdgpu: add pipe1 hardware support"
  drm/amdgpu: add AMDGPU_QUEUE_RESET_TIMEOUT
  drm/amdgpu: set the exec flag on the IB fence
  drm/amdgpu/gfx11: adjust ring reset sequences
  drm/amdgpu/gfx11: drop soft recovery
  drm/amdgpu/gfx12: adjust ring reset sequences
  drm/amdgpu/gfx12: drop soft recovery
  drm/amdgpu/gfx10: adjust ring reset sequences
  drm/amdgpu/gfx10: drop soft recovery

Christian König (1):
  drm/amdgpu: rework queue reset scheduler interaction

 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c  |  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 26 ++++++++--------
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 41 ++++++++-----------------
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c  | 35 ++++++---------------
 drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c  | 35 ++++++---------------
 drivers/gpu/drm/amd/amdgpu/nvd.h        |  1 +
 7 files changed, 50 insertions(+), 92 deletions(-)

-- 
2.49.0