On Tue, Apr 29, 2025 at 03:01:42PM -0500, Ben Cheatham wrote:
>
> On 4/28/25 9:35 PM, Alison Schofield wrote:
> > On Thu, Apr 24, 2025 at 04:23:55PM -0500, Ben Cheatham wrote:
> >> This series adds support for injecting CXL protocol (CXL.cache/mem)
> >> errors[1] into CXL RCH Downstream ports and VH root ports[2] and
> >> poison into CXL memory devices through the CXL debugfs. Errors are
> >> injected using a new 'inject-error' command, while errors are reported
> >> using a new cxl-list "-N"/"--injectable-errors" option.
> >>
> >> The 'inject-error' command and "-N" option of cxl-list both require
> >> access to the CXL driver's debugfs. Because the debugfs doesn't have a
> >> required mount point, a "--debugfs" option is added to both cxl-list and
> >> cxl-inject-error to specify the path to the debugfs if it isn't mounted
> >> to the usual place (/sys/kernel/debug).
> >>
> >> The documentation for the new cxl-inject-error command shows both usage
> >> and the possible device/error types, as well as how to retrieve them
> >> using cxl-list. The documentation for cxl-list has also been updated to
> >> show the usage of the new injectable errors and debugfs options.
> >>
> >> [1]: ACPI v6.5 spec, section 18.6.4
> >> [2]: ACPI v6.5 spec, table 18.31
> >
> > Hi Ben,
> >
> > Junhyeok Im posted a set for inject & clear poison back in 2023.[1] It
> > went through one round of review but was a bit ahead of its time as we
> > were still working out the presentation of media-errors in the trigger
> > poison patch set. I'll cc them here in case they have interest and can
> > help review this set.
>
> Thanks for pointing this out. I forgot to look for an existing set before
> implementing it myself, sorry about that :/.
>
> I'd be willing to drop the poison support from this set and use Junhyeok's
> instead, integrate it into this one, or leave it as-is.

I should have recalled that at RFC time. Anyway, compare and contrast and
select the best path forward.

>
> >
> > How come you're not interested in implementing clear-poison?
>
> It is implemented, it's a flag ("--clear") for the inject-error command.
> I forgot to mention it in the cover letter, I can add it in v2.

Ah, I haven't reviewed far enough yet to see that. I'm going to ask for
that to be its own command. We may get into some naming brouhaha. You are
using the word 'error' for multiple types of errors and we used
'media-error' specifically for device poison. I'll put more thought into
it when I review in detail.

>
> Thanks,
> Ben