On 6/19/24 2:24 AM, Shiyang Ruan wrote:
>
>
> 在 2024/6/19 7:35, Dave Jiang 写道:
>>
>>
>> On 6/18/24 9:53 AM, Shiyang Ruan wrote:
>>> Background:
>>> Since CXL device is a memory device, while CPU consumes a poison page of
>>> CXL device, it always triggers a MCE by interrupt (INT18), no matter
>>> which-First path is configured. This is the first report. Then
>>> currently, in FW-First path, the poison event is transferred according
>>> to the following process: CXL device -> firmware -> OS:ACPI->APEI->GHES
>>> -> CPER -> trace report. This is the second one. These two reports
>>> are indicating the same poisoning page, which is the so-called "duplicate
>>> report"[1]. And the memory_failure() handling I'm trying to add in
>>> OS-First path could also be another duplicate report.
>>>
>>> Hope the flow below could make it easier to understand:
>>> CPU accesses bad memory on CXL device, then
>>> -> MCE (INT18), *always* report (1)
>>> -> * FW-First (implemented now)
>>> -> CXL device -> FW
>>> -> OS:ACPI->APEI->GHES->CPER -> trace report (2.a)
>>> * OS-First (not implemented yet, I'm working on it)
>>> -> CXL device -> MSI
>>> -> OS:CXL driver -> memory_failure() (2.b)
>>> so, the (1) and (2.a/b) are duplicated.
>>>
>>> (I didn't get response in my reply for [1] while I have to make patch to
>>> solve this problem, so please correct me if my understanding is wrong.)
>>>
>>> This patch adds a new notifier_block and MCE_PRIO_CXL, for CXL memdev
>>> to check whether the current poison page has been reported (if yes,
>>> stop the notifier chain, won't call the following memory_failure()
>>> to report), into `x86_mce_decoder_chain`. In this way, if the poison
>>> page already handled(recorded and reported) in (1) or (2), the other one
>>> won't duplicate the report. The record could be clear when
>>> cxl_clear_poison() is called.
>>>
>>> [1]
>>> https://lore.kernel.org/linux-cxl/664d948fb86f0_e8be29...@dwillia2-mobl3.amr.corp.intel.com.notmuch/
>>>
>
> ...
>
>>> +
>>> +static bool cxl_contains_hpa(const struct cxl_memdev *cxlmd, u64 hpa)
>>> +{
>>> + struct cxl_contains_hpa_context ctx = {
>>> + .contains = false,
>>> + .hpa = hpa,
>>> + };
>>> + struct cxl_port *port;
>>> +
>>> + port = cxlmd->endpoint;
>>> + if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
>>
>> Maybe no need to check is_cxl_endpoint() if the port is retrieved from
>> cxlmd->endpoint.
>
> OK, I'll remove this.
>
>>
>> Also, in order to use cxl_num_decoders_committed(), cxl_region_rwsem must be
>> held. See the lockdep_assert_held() in the function. Maybe add a
>> guard(cxl_regoin_rwsem);
>> before the if statement above.
>
> Got it. I didn't realize it before. Will add it.
>
>
> BTW, may I have your opinion on this proposal? I'm not sure if the
> Background and problem described above are correct or not. If not, it could
> lead me in the wrong direction.
Patch looks ok to me, but I'm no RAS expert in this area. Lets wait for
comments from Jonathan and Dan.
>
> Thank you very much!
>
>
> --
> Ruan.
>
>>
>> DJ
>>