On 7/27/2022 12:30 PM, Jane Chu wrote: > On 7/27/2022 12:24 PM, Jane Chu wrote: >> On 7/27/2022 11:56 AM, Dan Williams wrote: >>> Jane Chu wrote: >>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine >>>> poison granularity") that changed nfit_handle_mce() callback to report >>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been >>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence >>>> injecting >>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks, >>>> because 0x1000 bytes is 8 512-byte. >>>> >>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode >>>> the LSB field to PAGE_SHIFT instead of consulting the input >>>> struct cper_sec_mem_err record. So change to rely on hardware whenever >>>> support is available. >>>> >>>> Link: >>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8...@oracle.com >>>> >>>> >>>> Reviewed-by: Dan Williams <dan.j.willi...@intel.com> >>>> Signed-off-by: Jane Chu <jane....@oracle.com> >>>> --- >>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++- >>>> 1 file changed, 13 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c >>>> b/arch/x86/kernel/cpu/mce/apei.c >>>> index 717192915f28..26d63818b2de 100644 >>>> --- a/arch/x86/kernel/cpu/mce/apei.c >>>> +++ b/arch/x86/kernel/cpu/mce/apei.c >>>> @@ -29,15 +29,27 @@ >>>> void apei_mce_report_mem_error(int severity, struct >>>> cper_sec_mem_err *mem_err) >>>> { >>>> struct mce m; >>>> + int grain = PAGE_SHIFT; >>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA)) >>>> return; >>>> + /* >>>> + * Even if the ->validation_bits are set for address mask, >>>> + * to be extra safe, check and reject an error radius '0', >>>> + * and fallback to the default page size. >>>> + */ >>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) { >>>> + grain = ~mem_err->physical_addr_mask + 1; >>>> + if (grain == 1) >>>> + grain = PAGE_SHIFT; >>> >>> Wait, if @grain is the number of bits to mask off the address, shouldn't >>> this be something like: >>> >>> grain = min_not_zero(PAGE_SHIFT, >>> hweight64(~mem_err->physical_addr_mask)); >> >> I see. I guess what you meant is >> grain = min(PAGE_SHIFT, (1 + >> hweight64(~mem_err->physical_addr_mask))); > > Sorry, take that back, it won't work either.
This will work, grain = min_not_zero(PAGE_SHIFT - 1, hweight64(~mem_err->physical_addr_mask)); grain++; but too sophisticated? I guess I prefer the simple "if" expression. thanks, -jane > > -jane > >> so that in the pmem poison case, 'grain' would be 8, not 7. >> >> thanks, >> -jane >> >>> >>> ...? >> >