On 7/27/2022 11:56 AM, Dan Williams wrote:
> Jane Chu wrote:
>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>> poison granularity") that changed nfit_handle_mce() callback to report
>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>> because 0x1000 bytes is 8 512-byte.
>>
>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>> the LSB field to PAGE_SHIFT instead of consulting the input
>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>> support is available.
>>
>> Link:
>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8...@oracle.com
>>
>> Reviewed-by: Dan Williams <dan.j.willi...@intel.com>
>> Signed-off-by: Jane Chu <jane....@oracle.com>
>> ---
>>  arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
>> index 717192915f28..26d63818b2de 100644
>> --- a/arch/x86/kernel/cpu/mce/apei.c
>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>> @@ -29,15 +29,27 @@
>>  void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err
>> *mem_err)
>>  {
>>      struct mce m;
>> +    int grain = PAGE_SHIFT;
>>
>>      if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>          return;
>>
>> +    /*
>> +     * Even if the ->validation_bits are set for address mask,
>> +     * to be extra safe, check and reject an error radius '0',
>> +     * and fallback to the default page size.
>> +     */
>> +    if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>> +        grain = ~mem_err->physical_addr_mask + 1;
>> +        if (grain == 1)
>> +            grain = PAGE_SHIFT;
>
> Wait, if @grain is the number of bits to mask off the address, shouldn't
> this be something like:
>
>     grain = min_not_zero(PAGE_SHIFT,
>                          hweight64(~mem_err->physical_addr_mask));
I see. I guess what you meant is

    grain = min(PAGE_SHIFT, (1 + hweight64(~mem_err->physical_addr_mask)));

so that in the pmem poison case, 'grain' would be 8, not 7.

thanks,
-jane

>
> ...?
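For checking the arithmetic in this exchange, here is a minimal userspace sketch. It is not kernel code: popcount64() and min_not_zero_u() are hypothetical stand-ins for the kernel's hweight64() and min_not_zero(), PAGE_SHIFT is assumed to be 12 (4 KiB pages), and any mask value fed to it is an assumption, not taken from a real CPER record.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for hweight64(): number of set bits in v. */
static unsigned int popcount64(uint64_t v)
{
    unsigned int n = 0;

    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical stand-in for min_not_zero(): smaller of a and b, ignoring zeros. */
static unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* What the quoted patch computes: the size of the masked-off range in bytes. */
static uint64_t range_bytes(uint64_t pa_mask)
{
    return ~pa_mask + 1;
}

/* The form suggested in the review: a bit count, capped at PAGE_SHIFT. */
static unsigned int grain_bits(uint64_t pa_mask)
{
    unsigned int bits = min_not_zero_u(PAGE_SHIFT, popcount64(~pa_mask));

    assert(bits > 0); /* PAGE_SHIFT is non-zero, so the result never is zero */
    return bits;
}
```

With an assumed mask of 0xffffffffffffff00, range_bytes() returns 0x100 (a byte count), while grain_bits() returns 8 (a shift). A mask with no low bits cleared gives a bit count of 0, so grain_bits() falls back to PAGE_SHIFT.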