[AMD Official Use Only - AMD Internal Distribution Only]

Reviewed-by: Hawking Zhang <hawking.zh...@amd.com>

Regards,
Hawking

-----Original Message-----
From: Lazar, Lijo <lijo.la...@amd.com>
Sent: Wednesday, June 4, 2025 12:10
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking <hawking.zh...@amd.com>; Deucher, Alexander <alexander.deuc...@amd.com>; Zhou1, Tao <tao.zh...@amd.com>
Subject: [PATCH] drm/amdgpu: Clear reset flags from ras context

Once RAS errors are cleared with appropriate recovery mechanism, clear
reset flags also from RAS context. Otherwise, stale flag values could
affect the subsequent RAS reset handling on the device.

Signed-off-by: Lijo Lazar <lijo.la...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index b275e464ae4f..b14d08f8feba 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -4415,8 +4415,10 @@ void amdgpu_ras_clear_err_state(struct amdgpu_device *adev)
 	struct amdgpu_ras *ras;
 
 	ras = amdgpu_ras_get_context(adev);
-	if (ras)
+	if (ras) {
 		ras->ras_err_state = 0;
+		ras->gpu_reset_flags = 0;
+	}
 }
 
 void amdgpu_ras_set_err_poison(struct amdgpu_device *adev,
-- 
2.25.1
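
For readers outside the driver, the effect of the change can be sketched in isolation. This is only a rough user-space model, not the driver code: the struct, the flag bit, and all sketch_* names are hypothetical stand-ins, and only the two fields touched by the patch are modelled. It shows that clearing ras_err_state alone leaves a previously set reset flag visible to a later check, while clearing both fields does not.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RAS context; only the two fields
 * touched by the patch are modelled here. */
struct ras_ctx_sketch {
	uint32_t ras_err_state;
	uint32_t gpu_reset_flags;
};

/* Hypothetical flag bit, used only to make a stale value visible. */
#define SKETCH_RESET_FLAG (1u << 0)

/* Behaviour before the patch: only the error state is cleared. */
void sketch_clear_err_state_old(struct ras_ctx_sketch *ras)
{
	if (ras)
		ras->ras_err_state = 0;
}

/* Behaviour after the patch: the reset flags are cleared as well. */
void sketch_clear_err_state_new(struct ras_ctx_sketch *ras)
{
	if (ras) {
		ras->ras_err_state = 0;
		ras->gpu_reset_flags = 0;
	}
}

/* Hypothetical check standing in for later reset handling that
 * consults the flags; returns 1 if any flag is still set. */
int sketch_reset_flags_pending(const struct ras_ctx_sketch *ras)
{
	return ras != NULL && ras->gpu_reset_flags != 0;
}
```

With a context whose gpu_reset_flags has SKETCH_RESET_FLAG set, sketch_reset_flags_pending() still reports a pending flag after the old clear, and reports none after the new one. How the real reset path consumes gpu_reset_flags is not shown in the patch and is not modelled here.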