Problem: After a GPU reset on dGPUs with gfx8, compute ring
1.0.0 fails the ring test. Inspecting the ring registers shows
that the ring is active and not hung (rptr == wptr). No
significant differences were observed between the CP_HQD*
registers of the ring in its good and bad states.

Fix: The root cause is unclear, but reversing the order of the
ring tests fixes the problem.

Signed-off-by: Andrey Grodzovsky <andrey.grodzov...@amd.com>
---
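The reversed iteration in the hunk below can be illustrated outside the
driver with a minimal, self-contained sketch. Everything named demo_* is
hypothetical and exists only for this illustration; it is not the amdgpu
ring structure or the ring test helper. The sketch assumes the loop index
is a signed int, which the "i >= 0" termination condition relies on.

```c
/* Hypothetical stand-in for a compute ring; NOT the amdgpu ring struct. */
struct demo_ring {
	int id;
	unsigned int rptr;	/* read pointer */
	unsigned int wptr;	/* write pointer */
};

/*
 * Hypothetical idle check mirroring the "rptr == wptr" observation:
 * the ring has consumed everything that was written to it.
 */
static int demo_ring_is_idle(const struct demo_ring *ring)
{
	return ring->rptr == ring->wptr;
}

/*
 * Visit the rings from last to first and record the visiting order in
 * order[]. Returns the number of rings visited.
 */
static int demo_visit_rings_reversed(const struct demo_ring *rings,
				     int num_rings, int *order)
{
	int i, n = 0;

	/* i must be signed: with an unsigned index, i >= 0 never fails. */
	for (i = num_rings - 1; i >= 0; i--)
		order[n++] = rings[i].id;

	return n;
}
```

With three rings whose ids are 0, 1 and 2, demo_visit_rings_reversed()
fills order[] with 2, 1, 0; with num_rings == 0 the loop body never runs.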
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index b2e1376..02f8ca5 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -4811,8 +4811,10 @@ static int gfx_v8_0_kcq_resume(struct amdgpu_device *adev)
        if (r)
                goto done;
 
-       /* Test KCQs */
-       for (i = 0; i < adev->gfx.num_compute_rings; i++) {
+       /* Test KCQs - reversing the order of rings seems to fix ring test failure
+        * after GPU reset
+        */
+       for (i = adev->gfx.num_compute_rings - 1; i >= 0; i--) {
                ring = &adev->gfx.compute_ring[i];
                r = amdgpu_ring_test_helper(ring);
        }
-- 
2.7.4
