Shiyang Ruan wrote:
> The GMER only has "Physical Address" field, no such one indicates length.
> So, when a poison event is received, we could use GET_POISON_LIST command
> to get the poison list. Now driver has cxl_mem_get_poison(), so
> reuse it and add a parameter 'bool report', report poison record to MCE
> if set true.

I am not sure I agree with the rationale here, because there is no correlation between the event being signaled and the current state of the poison list.

It also establishes a race between multiple GMER events. For example, imagine the hardware sends 4 GMER events to communicate a 256B poison discovery event. Does the driver need logic to support GMER events 2, 3, and 4 if it already saw all 256B of poison after processing GMER event 1?

I think the best the driver can do is assume at least 64B of poison per event and depend on multiple notifications to handle larger poison lengths. Otherwise, the poison list is really only useful for pre-populating pages to offline after a reboot, i.e. to catch the kernel up with the state of poisoned pages.
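
To make the "64B per event" suggestion concrete, here is a rough userspace sketch, not driver code. All names (gmer_to_poison_range, struct poison_range, GMER_POISON_UNIT, GMER_DPA_FLAGS_MASK) are hypothetical, and it assumes the low 6 bits of the GMER "Physical Address" field carry flags rather than address bits. Each event is translated on its own into one 64B-aligned range, with no lookup in the poison list:

```c
#include <stdint.h>

/* Assumption: minimum poison length implied by a single GMER event. */
#define GMER_POISON_UNIT 64ULL
/* Assumption: low 6 bits of the Physical Address field are flags. */
#define GMER_DPA_FLAGS_MASK 0x3FULL

/* Hypothetical container for the range reported for one event. */
struct poison_range {
	uint64_t start;	/* 64B-aligned device physical address */
	uint64_t len;	/* always GMER_POISON_UNIT in this sketch */
};

/*
 * Translate one GMER Physical Address field into the range to report.
 * The event is handled in isolation: no GET_POISON_LIST query, so
 * there is nothing to race against when several events arrive for
 * one larger poison discovery.
 */
static struct poison_range gmer_to_poison_range(uint64_t phys_addr_field)
{
	struct poison_range r;

	r.start = phys_addr_field & ~GMER_DPA_FLAGS_MASK;
	r.len = GMER_POISON_UNIT;
	return r;
}
```

With this shape, a 256B discovery signaled as 4 GMER events simply yields 4 adjacent 64B ranges, one per notification, and the handler for event 1 never needs to know about events 2 through 4.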