We currently fail to log unrecoverable errors, because
machine_check_log_err() is not called for them.

Raise irq work in save_mce_event() for unrecoverable errors, so that
the error is logged from the MCE event handling block in the timer
handler.

Log without this change

 MCE: CPU27: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
 MCE: CPU27: PID: 10580 Comm: inject-ra-err NIP: [0000000010000df4]
 MCE: CPU27: Initiator CPU
 MCE: CPU27: Unknown

Log with this change

 MCE: CPU24: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
 MCE: CPU24: PID: 1589811 Comm: inject-ra-err NIP: [0000000010000e48]
 MCE: CPU24: Initiator CPU
 MCE: CPU24: Unknown
 RTAS: event: 5, Type: Platform Error (224), Severity: 3

Signed-off-by: Ganesh Goudar <ganes...@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mah...@linux.ibm.com>
---
V2: Rephrasing the commit message.
---
 arch/powerpc/kernel/mce.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 6c5d30fba766..a1cb2172eb7b 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
 	if (mce->error_type == MCE_ERROR_TYPE_UE)
 		mce->u.ue_error.ignore_event = mce_err->ignore_event;
 
+	/*
+	 * Raise irq work so that we don't miss logging the error for
+	 * unrecoverable errors.
+	 */
+	if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
+		mce_irq_work_queue();
+
 	if (!addr)
 		return;
 
@@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
 	       evt, sizeof(*evt));
 
 	/* Queue work to process this event later. */
-	mce_irq_work_queue();
 }
 
 /*
-- 
2.38.1