RE: [PATCH] acpi/nfit: badrange report spill over to clean range

Dan Williams Tue, 12 Jul 2022 17:48:40 -0700

Jane Chu wrote:
> Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine poison
> granularity") changed nfit_handle_mce() callback to report badrange for
> each poison at an alignment indicated by 1ULL << MCI_MISC_ADDR_LSB(mce->misc)
> instead of the hardcoded L1_CACHE_BYTES. However, recently on a server
> populated with Intel DCPMEM v2 dimms, it appears that
> 1UL << MCI_MISC_ADDR_LSB(mce->misc) turns out to be 4KiB, or 8 512-byte blocks.
> Consequently, injecting 2 back-to-back poisons via ndctl results in 8
> poisons being reported.
> 
> [29076.590281] {3}[Hardware Error]:   physical_address: 0x00000040a0602400
> [..]
> [29076.619447] Memory failure: 0x40a0602: recovery action for dax page: Recovered
> [29076.627519] mce: [Hardware Error]: Machine check events logged
> [29076.634033] nfit ACPI0012:00: addr in SPA 1 (0x4080000000, 0x1f80000000)
> [29076.648805] nd_bus ndbus0: XXX nvdimm_bus_add_badrange: (0x40a0602000, 0x1000)
> [..]
> [29078.634817] {4}[Hardware Error]:   physical_address: 0x00000040a0602600
> [..]
> [29079.595327] nfit ACPI0012:00: addr in SPA 1 (0x4080000000, 0x1f80000000)
> [29079.610106] nd_bus ndbus0: XXX nvdimm_bus_add_badrange: (0x40a0602000, 0x1000)
> [..]
> {
>   "dev":"namespace0.0",
>   "mode":"fsdax",
>   "map":"dev",
>   "size":33820770304,
>   "uuid":"a1b0f07f-747f-40a8-bcd4-de1560a1ef75",
>   "sector_size":512,
>   "align":2097152,
>   "blockdev":"pmem0",
>   "badblock_count":8,
>   "badblocks":[
>     {
>       "offset":8208,
>       "length":8,
>       "dimms":[
>         "nmem0"
>       ]
>     }
>   ]
> }
> 
> So, 1UL << MCI_MISC_ADDR_LSB(mce->misc) is an unreliable indicator for poison
> radius and shouldn't be used.  Moreover, as each injected poison is being
> reported independently, any alignment up to 512 bytes appears to work:
> L1_CACHE_BYTES (though inaccurate), or 256 bytes (as ars->length reports),
> or 512 bytes.
> 
> To get around this issue, 512 bytes is chosen as the alignment because
>   a. it happens to be the badblock granularity,
>   b. ndctl inject-error cannot inject more than one poison to a 512-byte block,
>   c. it is architecture agnostic


I am failing to see the kernel bug? Yes, you injected less than 8
"badblocks" of poison and the hardware reported 8 blocks of poison, but
that's not the kernel's fault, that's the hardware. What happens when
hardware really does detect 8 blocks of consecutive poison and this
implementation decides to only record 1 at a time?

It seems the fix you want is for the hardware to report the precise
error bounds and that 1UL << MCI_MISC_ADDR_LSB(mce->misc) does not have
that precision in this case.
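
To make the arithmetic concrete, here is a minimal userspace sketch of how
one reported address expands into a bad range. It is not the kernel's
nfit_handle_mce() code: the names mce_bad_range() and sectors_covered() are
hypothetical, and the LSB value of 12 is inferred from the 0x1000 length in
the log above rather than read from mce->misc.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL	/* badblock granularity */

/* Hypothetical helper type: a physical range to record as bad. */
struct bad_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Align the reported address down to (1 << lsb) and cover one full
 * unit of that size, which is what reporting at the granularity
 * indicated by the machine check amounts to.
 */
static struct bad_range mce_bad_range(uint64_t phys_addr, unsigned int lsb)
{
	struct bad_range r;

	assert(lsb < 64);
	r.len = 1ULL << lsb;
	r.start = phys_addr & ~(r.len - 1);
	return r;
}

/* Number of 512-byte sectors that the range touches. */
static uint64_t sectors_covered(struct bad_range r)
{
	uint64_t first = r.start / SECTOR_SIZE;
	uint64_t last = (r.start + r.len - 1) / SECTOR_SIZE;

	return last - first + 1;
}
```

With lsb = 12, both 0x40a0602400 and 0x40a0602600 map to the same range
(0x40a0602000, 0x1000), which covers 8 sectors. With lsb = 9 each address
would cover a single sector.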

However, the ARS engine likely can return the precise error ranges, so I
think the fix is to issue a short ARS scrub request asking the device for
the precise error list, and to use the address range indicated by 1UL <<
MCI_MISC_ADDR_LSB(mce->misc) to filter the results.
