Jane Chu wrote: > On 7/27/2022 12:30 PM, Jane Chu wrote: > > On 7/27/2022 12:24 PM, Jane Chu wrote: > >> On 7/27/2022 11:56 AM, Dan Williams wrote: > >>> Jane Chu wrote: > >>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine > >>>> poison granularity") that changed nfit_handle_mce() callback to report > >>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been > >>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence > >>>> injecting > >>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks, > >>>> because 0x1000 bytes is 8 512-byte. > >>>> > >>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode > >>>> the LSB field to PAGE_SHIFT instead of consulting the input > >>>> struct cper_sec_mem_err record. So change to rely on hardware whenever > >>>> support is available. > >>>> > >>>> Link: > >>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8...@oracle.com > >>>> > >>>> > >>>> > >>>> Reviewed-by: Dan Williams <dan.j.willi...@intel.com> > >>>> Signed-off-by: Jane Chu <jane....@oracle.com> > >>>> --- > >>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++- > >>>> 1 file changed, 13 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c > >>>> b/arch/x86/kernel/cpu/mce/apei.c > >>>> index 717192915f28..26d63818b2de 100644 > >>>> --- a/arch/x86/kernel/cpu/mce/apei.c > >>>> +++ b/arch/x86/kernel/cpu/mce/apei.c > >>>> @@ -29,15 +29,27 @@ > >>>> void apei_mce_report_mem_error(int severity, struct > >>>> cper_sec_mem_err *mem_err) > >>>> { > >>>> struct mce m; > >>>> + int grain = PAGE_SHIFT; > >>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA)) > >>>> return; > >>>> + /* > >>>> + * Even if the ->validation_bits are set for address mask, > >>>> + * to be extra safe, check and reject an error radius '0', > >>>> + * and fallback to the default page size. > >>>> + */ > >>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) { > >>>> + grain = ~mem_err->physical_addr_mask + 1; > >>>> + if (grain == 1) > >>>> + grain = PAGE_SHIFT; > >>> > >>> Wait, if @grain is the number of bits to mask off the address, shouldn't > >>> this be something like: > >>> > >>> grain = min_not_zero(PAGE_SHIFT, > >>> hweight64(~mem_err->physical_addr_mask)); > >> > >> I see. I guess what you meant is > >> grain = min(PAGE_SHIFT, (1 + > >> hweight64(~mem_err->physical_addr_mask))); > > > > Sorry, take that back, it won't work either. > > This will work, > grain = min_not_zero(PAGE_SHIFT - 1, > hweight64(~mem_err->physical_addr_mask)); > grain++; > but too sophisticated? I guess I prefer the simple "if" expression.
An "if" is fine, I was more pointing out that: hweight64(~mem_err->physical_addr_mask) + 1 ...and: ~mem_err->physical_addr_mask + 1; ...give different results.