On Sat, 23 Dec 2023 00:33:43 +0800 Shiyang Ruan <ruansy.f...@fujitsu.com> wrote:
> Hi guys,
>
> I have some thoughts and questions about CXL & MCE mechanism.

+CC qemu-devel as quite a bit of this is QEMU related.

> CXL type-3 devices can be used as volatile or persistent memory, so a
> poisoned page on them should also trigger a memory failure, to let OS
> handle process using the page and let device driver recover the page.  I
> am now investigating this.
>
> Currently, CXL RAS is under development.  We can now inject POISON on a
> CXL device by qemu (qmp commands), and then `cxl list -L` could show
> those poisoned areas.  But the POISON injection is silent, I think we
> need a signal here to notify OS to handle those poisoned areas when
> injecting.

Agreed. The emulation is far from complete. It should kick off the
relevant event log entry additions as well under at least some
circumstances (depends whether we think we are injecting poison to be
discovered later - which it won't be because we don't check for poison
when doing reads and writes - I've not yet figured out how to do that in
QEMU).

If we are using the inject poison opcode from the host OS then we are
missing the bit in 8.2.9.9.4.2 (CXL r3.1)
"In addition, the device shall add an appropriate poison creation event to
its internal informational event log, update the event status register and
if configured, interrupt the host".

So that should do a General Media Event of type 04h - host inject poison.

For the qmp interface we should add control of whether we are injecting
poison that is intended to trigger an error now (e.g. what would result
from a scrub detecting it) or poison for detection later - either by
triggering a media scan, or by a host read / write.

If it's a scrub poison detection that we are emulating then we should
issue an uncorrectable GMR Event record with Memory Event Type of Scrub
media, or maybe a 00h (Media ECC error) if we think some other reason
might cause it, and transaction type 05 Media patrol scrub.
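As a minimal sketch of the control described above: the injection path picks a cause, and the cause decides whether an event record is raised immediately. All names here (`InjectCause`, `GeneralMediaEvent`, `event_for_injection`) are hypothetical and are not QEMU's API. The numeric values are the ones quoted in this mail and have not been re-checked against the spec. Fields the mail does not specify are left as `None`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Values as quoted in the mail above; not re-checked against the spec.
MEM_EVENT_MEDIA_ECC = 0x00        # "00h (Media ECC error)"
TRANS_HOST_INJECT_POISON = 0x04   # "04h - host inject poison"
TRANS_MEDIA_PATROL_SCRUB = 0x05   # "05 Media patrol scrub"


class InjectCause(Enum):
    """Hypothetical causes an injection command could select."""
    HOST_INJECT = "host-inject"    # host OS used the inject poison opcode
    SCRUB_DETECT = "scrub-detect"  # emulate a scrub finding the poison now
    LATENT = "latent"              # leave poison for a later scan or access


@dataclass(frozen=True)
class GeneralMediaEvent:
    """Illustrative subset of a General Media Event record."""
    dpa: int
    transaction_type: int
    memory_event_type: Optional[int] = None  # None: not stated in the mail
    uncorrectable: Optional[bool] = None     # None: not stated in the mail


def event_for_injection(dpa: int, cause: InjectCause) -> Optional[GeneralMediaEvent]:
    """Return the record to log at injection time, or None if none is due."""
    if cause is InjectCause.LATENT:
        # Poison is meant to be found later, so nothing is reported now.
        return None
    if cause is InjectCause.HOST_INJECT:
        return GeneralMediaEvent(dpa, TRANS_HOST_INJECT_POISON)
    if cause is InjectCause.SCRUB_DETECT:
        return GeneralMediaEvent(dpa, TRANS_MEDIA_PATROL_SCRUB,
                                 MEM_EVENT_MEDIA_ECC, True)
    raise ValueError(f"unknown cause: {cause!r}")
```

In this sketch a caller would append any returned record to the device event log and raise the interrupt if one is configured; the `LATENT` case returns nothing so the poison is only discovered by a later scan or access.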
Note IIRC you can manually inject these records, which will result in
appropriate events being reported in Linux; they just aren't currently
hooked up to the QEMU poison injection (qmp or host interface).

If you have time to look at filling these more complex flows in that
would be great as it would make the qemu side of things easier to use.

> According to CXL 3.0 spec Figure 12-5, there are 2 methods
> to send the signal: FW-First and OS-First.
> My understanding about them is:
> - FW-First method:
>   a. CXL device report POISON to Firmware
>   b. GHES calls CXL driver handler[1], which will handle the POISON

I'm in two minds about how to emulate the firmware first paths.
In the short term I'll get some old code I have running again that lets
us do general CPER record injection.  However, we might want to actually
push the record creation into EDK2.  Meh - let's do it in qemu first and
see how bad it looks.

>   c. CXL driver handler translates DPA to HPA, construct a mce
>      instance, then call mce_log() to queue this MCE (? not sure)

Yes, the last step is missing currently I think? (I'm losing track of
some of the RAS flows).

> - OS-First method:
>   a. CXL device report POISON to OS by MSI
>   b. CXL driver will handle the POISON
>   c. same with the c. above
>
> So, I think:
> Firstly, and obviously, we need to add a signal when injecting POISON in
> qemu.  For example, call `cxl_event_insert()` after injection.

Yes - create the appropriate records and add them.  However we'll need to
enable adding different causes of poison so we know whether to do this
or to rely on later queries or not.

> Secondly, implement a method in CXL driver to turn POISON to MCE and
> push it into the mce_evt_pool for decode chain to process, then
> mce_uc_nb on this chain will finally call memory_failure().
>
> And a question:
> How to configure the CXL device to choose FW-First or OS-First signal
> methods (methods for qemu and bare metal if possible)?

There is an _OSC for this.
We can hook that up in QEMU but it may be controversial to do it there
rather than in EDK2.

> I don't fully understand the CXL spec yet (it's difficult for me), so
> the above ideas may be immature, but I really want to figure out how we
> can make CXL & MCE work.  I'd really appreciate it if you could help me
> on this!
>
> [1]
> https://lore.kernel.org/linux-cxl/20231220-cxl-cper-v5-0-1bb8a4ca2...@intel.com/T/#u

Great if you can look at filling in the details in this area.
There are still quite a few flows we haven't fully realized in emulation
or in the kernel.

Jonathan

> --
> Thanks,
> Ruan
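The driver-side step discussed in this thread (translate the poisoned DPA range to HPA, then hand the affected pages to memory failure handling) can be sketched as below. This is only an illustration: it assumes a single region with no interleave, so the translation is a plain offset, and it assumes 4 KiB pages. The helper names and the example addresses are hypothetical and are not taken from the kernel.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def dpa_to_hpa(dpa: int, dpa_base: int, hpa_base: int, size: int) -> int:
    """Translate a device physical address to a host physical address.

    Assumes one decoder with no interleave, so HPA = hpa_base + offset.
    Real translation has to account for interleave ways and granularity.
    """
    offset = dpa - dpa_base
    if not 0 <= offset < size:
        raise ValueError(f"DPA {dpa:#x} is outside the mapped range")
    return hpa_base + offset


def poisoned_pfns(dpa: int, length: int, dpa_base: int,
                  hpa_base: int, size: int) -> list:
    """Return every page frame number touched by a poisoned DPA range."""
    if length <= 0:
        raise ValueError("length must be positive")
    first = dpa_to_hpa(dpa, dpa_base, hpa_base, size) >> PAGE_SHIFT
    last = dpa_to_hpa(dpa + length - 1, dpa_base, hpa_base, size) >> PAGE_SHIFT
    return list(range(first, last + 1))


# Example: a 128-byte poisoned range that straddles a page boundary.
pfns = poisoned_pfns(dpa=0x1FC0, length=0x80, dpa_base=0x0,
                     hpa_base=0x1_0000_0000, size=0x1000_0000)
print([hex(p) for p in pfns])  # ['0x100001', '0x100002']
```

Each resulting PFN is what a handler would pass on to the memory failure path; a range that crosses a page boundary yields more than one page, which is why the sketch returns a list rather than a single value.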