[AMD Official Use Only - AMD Internal Distribution Only]

It's fine to check only the scrub bit, so that the check covers all
scenarios rather than just poison creation. Please also update the kernel
message to "hardware error logged by the scrubber".
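
As a rough illustration of the suggestion above, the check could be modeled
outside the driver as follows. This is a standalone sketch, not the actual
patch: the SKETCH_* macro names and the sketch_report_scrub() helper are
hypothetical, and the scrub bit position used here is an assumption that
should be checked against ACA_REG__STATUS__SCRUB() in amdgpu_aca.h.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Standalone sketch, not the driver code. The bit position below is an
 * assumption for illustration; the driver's ACA_REG__STATUS__SCRUB()
 * macro in amdgpu_aca.h is the authoritative definition.
 */
#define SKETCH_STATUS_SCRUB_SHIFT 40
#define SKETCH_STATUS_SCRUB(status) \
	(((uint64_t)(status) >> SKETCH_STATUS_SCRUB_SHIFT) & 0x1ULL)

/*
 * Hypothetical helper: writes the suggested message to 'out' whenever the
 * scrub bit is set, with no additional deferred-error condition.
 * Returns 1 if a message was written, 0 otherwise.
 */
static int sketch_report_scrub(uint64_t status, FILE *out)
{
	if (!SKETCH_STATUS_SCRUB(status))
		return 0;

	fprintf(out, "hardware error logged by the scrubber\n");
	return 1;
}
```

In the patch itself this would presumably amount to dropping the
ACA_BANK_ERR_IS_DEFFERED(bank) condition and changing the string passed to
RAS_EVENT_LOG().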

Regards,
Hawking

-----Original Message-----
From: Liu, Xiang(Dean) <xiang....@amd.com>
Sent: Friday, April 18, 2025 15:32
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking <hawking.zh...@amd.com>; Liu, Xiang(Dean) <xiang....@amd.com>
Subject: [PATCH] drm/amdgpu: Print kernel message when error logged by scrub

Print a kernel message when the scrub bit of the status register is set, to
indicate that the error was logged by the scrub.

Signed-off-by: Xiang Liu <xiang....@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c
index b4ad163f42a7..2b7b3abdbfc7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c
@@ -120,6 +120,10 @@ static void aca_smu_bank_dump(struct amdgpu_device *adev, int idx, int total, st
        for (i = 0; i < ARRAY_SIZE(aca_regs); i++)
                RAS_EVENT_LOG(adev, event_id, HW_ERR "ACA[%02d/%02d].%s=0x%016llx\n",
                              idx + 1, total, aca_regs[i].name, bank->regs[aca_regs[i].reg_idx]);
+
+       if (ACA_BANK_ERR_IS_DEFFERED(bank) &&
+           ACA_REG__STATUS__SCRUB(bank->regs[ACA_REG_IDX_STATUS]))
+               RAS_EVENT_LOG(adev, event_id, HW_ERR "Error logged by scrub\n");
 }

 static int aca_smu_get_valid_aca_banks(struct amdgpu_device *adev, enum aca_smu_type type,
--
2.34.1
