On 7/15/2022 12:17 PM, Dan Williams wrote:
> [ add Tony ]
>
> Jane Chu wrote:
>> On 7/14/2022 6:19 PM, Dan Williams wrote:
>>> Jane Chu wrote:
>>>> I meant to say there would be 8 calls to the nfit_handle_mce() callback,
>>>> one call for each poison with accurate address.
>>>>
>>>> Also, short ARS would find 2 poisons.
>>>>
>>>> I attached the console output, my annotation is prefixed with "<==".
>>>
>>> [29078.634817] {4}[Hardware Error]: physical_address: 0x00000040a0602600
>>> <== 2nd poison @ 0x600
>>> [29078.642200] {4}[Hardware Error]: physical_address_mask:
>>> 0xffffffffffffff00
>>>
>>> Why is nfit_handle_mce() seeing a 4K address mask when the CPER record
>>> is seeing a 256-byte address mask?
>>
>> Good question! One would think both GHES reporting and
>> nfit_handle_mce() are consuming the same mce record...
>> Who might know?
>
> Did some grepping...
>
> Have a look at: apei_mce_report_mem_error()
>
> "The call is coming from inside the house!"
>
> Luckily we do not need to contact a BIOS engineer to get this fixed.

Great, thank you! Just put together a quick fix for review after I
tested it.

>
>>> Sigh, is this "firmware-first" causing the kernel to get bad information
>>> via the native mechanisms
>
>>> I would expect that if this test was truly worried about minimizing BIOS
>>> latency it would disable firmware-first error reporting. I wonder if
>>> that fixes the observed problem?
>>
>> Could you elaborate on firmware-first error please? What are the
>> possible consequences disabling it? and how to disable it?
>
> With my Linux kernel developer hat on, firmware-first error handling is
> really only useful for supporting legacy operating systems that do not
> have native machine check handling, or for platforms that have bugs that
> would otherwise cause OS native error handling to fail. Otherwise, for
> modern Linux, firmware-first error handling is pure overhead and a
> source of bugs.
>
> In this case the bug is in the Linux code that translates the ACPI event
> back into an MCE record.

Thanks!
-jane
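
For reference, the mismatch discussed above (a 256-byte mask in the CPER
record versus a 4K range seen by nfit_handle_mce()) comes down to how many
low bits of the address are treated as "don't care". Below is a minimal
sketch in plain C, not kernel code, of deriving that granularity from the
reported mask instead of assuming a page. The names sketch_mask_to_lsb(),
sketch_lsb_to_bytes() and SKETCH_PAGE_SHIFT are hypothetical and exist only
for this illustration; the fallback to page granularity when no low bit is
set is likewise an assumption, not something stated in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 4K page shift; assumption for this sketch. */
#define SKETCH_PAGE_SHIFT 12u

/*
 * Return the index of the lowest set bit of a physical address mask,
 * i.e. log2 of the error granularity the mask describes.
 * If no bit below SKETCH_PAGE_SHIFT is set, fall back to page granularity.
 */
static unsigned int sketch_mask_to_lsb(uint64_t pa_mask)
{
	unsigned int lsb;

	for (lsb = 0; lsb < SKETCH_PAGE_SHIFT; lsb++)
		if (pa_mask & (1ULL << lsb))
			return lsb;
	return SKETCH_PAGE_SHIFT;
}

/* Size in bytes of the range covered by a given least significant bit. */
static uint64_t sketch_lsb_to_bytes(unsigned int lsb)
{
	return 1ULL << lsb;
}
```

With the mask from the console output, 0xffffffffffffff00, the lowest set
bit is bit 8, which gives a 256-byte range. A mask of 0xfffffffffffff000
gives bit 12, which is the 4K range.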