On Thu, Apr 18, 2019 at 5:26 PM Borislav Petkov <b...@alien8.de> wrote:
>
> Now, if any of that above still doesn't make it clear, please state what
> you're trying to achieve and I'll try to help.

Sorry that I misled you into believing we don't even enable
CONFIG_X86_MCELOG_LEGACY. Here is what we have and what we have tried:

1. We have CONFIG_X86_MCELOG_LEGACY=y
2. We also have CONFIG_RAS=y and CONFIG_RAS_CEC=y
3. mcelog started as a daemon successfully, like before
4. Some real correctable memory errors happened, as logged in dmesg
5. mcelog couldn't receive any of them and reported 0 errors
6. Admins complained to us, as they believe this is a kernel bug
7. We dug into the kernel source code and found that CONFIG_RAS hijacks
   all these errors by stopping in the notification chain:

	static int mce_first_notifier(struct notifier_block *nb, unsigned long val,
				      void *data)
	{
		struct mce *m = (struct mce *)data;

		if (!m)
			return NOTIFY_DONE;

		if (cec_add_mce(m))
			return NOTIFY_STOP;	// <=== Returns and stops here

		/* Emit the trace record: */
		trace_mce_record(m);

		set_bit(0, &mce_need_notify);

		mce_notify_irq();	// <=== This is where mcelog receives them

		return NOTIFY_DONE;
	}

8. I noticed rasdaemon and tried starting it instead of mcelog.
9. I injected some memory errors and could successfully read them via
   ras-mc-ctl.

To demonstrate what I think we should have, here is some PoC code, ONLY to
show the idea (please don't judge it):

@@ -567,12 +567,12 @@ static int mce_first_notifier(struct notifier_block *nb, unsigned long val,
 			      void *data)
 {
 	struct mce *m = (struct mce *)data;
+	bool consumed;
 
 	if (!m)
 		return NOTIFY_DONE;
 
-	if (cec_add_mce(m))
-		return NOTIFY_STOP;
+	consumed = cec_add_mce(m);
 
 	/* Emit the trace record: */
 	trace_mce_record(m);
@@ -581,7 +581,7 @@ static int mce_first_notifier(struct notifier_block *nb, unsigned long val,
 
 	mce_notify_irq();
 
-	return NOTIFY_DONE;
+	return consumed ? NOTIFY_STOP : NOTIFY_DONE;
 }

With this change (which I have not even compiled), mcelog should still
receive correctable memory errors like before, even when we have
CONFIG_RAS_CEC=y.

Does this make any sense to you?

Thanks!
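
P.S. In case it helps to see the control flow in isolation, here is a
minimal userspace sketch of the two variants. It is a model only, not
kernel code: the MODEL_* constants, struct model_state, and both function
names are hypothetical stand-ins I made up for illustration, and the
cec_consumed flag merely simulates a non-zero return from cec_add_mce().

```c
#include <stdbool.h>

/* Userspace model only: these are NOT the kernel's definitions. */
enum { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_STOP = 1 };

struct model_state {
	bool traced;		/* stands in for trace_mce_record() */
	bool mcelog_woken;	/* stands in for mce_notify_irq() */
};

/* Current flow: a consumed error returns before mcelog is notified. */
int first_notifier_current(bool cec_consumed, struct model_state *st)
{
	if (cec_consumed)
		return MODEL_NOTIFY_STOP;

	st->traced = true;
	st->mcelog_woken = true;
	return MODEL_NOTIFY_DONE;
}

/* PoC flow: always trace and notify, then report whether CEC consumed it. */
int first_notifier_poc(bool cec_consumed, struct model_state *st)
{
	st->traced = true;
	st->mcelog_woken = true;
	return cec_consumed ? MODEL_NOTIFY_STOP : MODEL_NOTIFY_DONE;
}
```

Calling both with cec_consumed set to true shows the difference I care
about: the first leaves mcelog_woken false, the second sets it to true
while still stopping the rest of the chain.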