On Tue, 21 Feb 2023 23:15:49 +0100 Philippe Mathieu-Daudé <phi...@linaro.org> wrote:
> Hi Jonathan,
>
> On 21/2/23 16:21, Jonathan Cameron wrote:
> > CXL uses PCI AER Internal errors to signal to the host that an error has
> > occurred. The host can then read more detailed status from the CXL RAS
> > capability.
> >
> > For uncorrectable errors: support multiple injection in one operation
> > as this is needed to reliably test multiple header logging support in an
> > OS. The equivalent feature doesn't exist for correctable errors, so only
> > one error need be injected at a time.
> >
> > Note:
> >   - Header content needs to be manually specified in a fashion that
> >     matches the specification for what can be in the header for each
> >     error type.
> >
> > Injection via QMP:
> > { "execute": "qmp_capabilities" }
> > ...
> > { "execute": "cxl-inject-uncorrectable-errors",
> >   "arguments": {
> >     "path": "/machine/peripheral/cxl-pmem0",
> >     "errors": [
> >         {
> >             "type": "cache-address-parity",
> >             "header": [ 3, 4]
> >         },
> >         {
> >             "type": "cache-data-parity",
> >             "header": [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]
> >         },
> >         {
> >             "type": "internal",
> >             "header": [ 1, 2, 4]
> >         }
> >         ]
> >   }}
> > ...
> > { "execute": "cxl-inject-correctable-error",
> >     "arguments": {
> >         "path": "/machine/peripheral/cxl-pmem0",
> >         "type": "physical"
> >     } }
> >
> > Signed-off-by: Jonathan Cameron <jonathan.came...@huawei.com>

Hi Philippe,

Thanks for your review. One question inline.

> > +#
> > +# Type of uncorrectable CXL error to inject. These errors are reported via
> > +# an AER uncorrectable internal error with additional information logged at
> > +# the CXL device.
> > +#
> > +# @cache-data-parity: Data error such as data parity or data ECC error CXL.cache
> > +# @cache-address-parity: Address parity or other errors associated with the
> > +#                        address field on CXL.cache
> > +# @cache-be-parity: Byte enable parity or other byte enable errors on CXL.cache
> > +# @cache-data-ecc: ECC error on CXL.cache
> > +# @mem-data-parity: Data error such as data parity or data ECC error on CXL.mem
> > +# @mem-address-parity: Address parity or other errors associated with the
> > +#                      address field on CXL.mem
> > +# @mem-be-parity: Byte enable parity or other byte enable errors on CXL.mem.
> > +# @mem-data-ecc: Data ECC error on CXL.mem.
> > +# @reinit-threshold: REINIT threshold hit.
> > +# @rsvd-encoding: Received unrecognized encoding.
> > +# @poison-received: Received poison from the peer.
> > +# @receiver-overflow: Buffer overflows (first 3 bits of header log indicate which)
> > +# @internal: Component specific error
> > +# @cxl-ide-tx: Integrity and data encryption tx error.
> > +# @cxl-ide-rx: Integrity and data encryption rx error.
> > +##
> > +
> > +{ 'enum': 'CxlUncorErrorType',
>
> Doesn't these need
>
>      'if': 'CONFIG_CXL_MEM_DEVICE',
>
> ?

If I make this change I get a bunch of errors like:

./qapi/qapi-types-cxl.h:18:13: error: attempt to use poisoned "CONFIG_CXL_MEM_DEVICE"
   18 | #if defined(CONFIG_CXL_MEM_DEVICE)

It's a target-specific define (I think), as it is built alongside PCI_EXPRESS.
Only CXL_ACPI is specifically included by x86 and arm64 (out of tree).

To be honest, though, I don't fully understand the QEMU build system, so my
guess at the reason for the error might be wrong.
>
> > +  'data': ['cache-data-parity',
> > +           'cache-address-parity',
> > +           'cache-be-parity',
> > +           'cache-data-ecc',
> > +           'mem-data-parity',
> > +           'mem-address-parity',
> > +           'mem-be-parity',
> > +           'mem-data-ecc',
> > +           'reinit-threshold',
> > +           'rsvd-encoding',
> > +           'poison-received',
> > +           'receiver-overflow',
> > +           'internal',
> > +           'cxl-ide-tx',
> > +           'cxl-ide-rx'
> > +           ]
> > + }
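
For anyone scripting the injection from the commit message, here is a minimal
Python sketch that builds the "cxl-inject-uncorrectable-errors" command. The
command name, argument names and device path are taken from the QMP example
quoted above, and the type list is copied from the CxlUncorErrorType enum. The
helper name build_uncor_inject_cmd is hypothetical and not part of QEMU. The
sketch only constructs the JSON and checks the type names; it does not talk to
a QMP socket and does not check the header content against the specification.

```python
import json

# Error type names copied from the CxlUncorErrorType enum in the quoted patch.
UNCOR_ERROR_TYPES = [
    "cache-data-parity", "cache-address-parity", "cache-be-parity",
    "cache-data-ecc", "mem-data-parity", "mem-address-parity",
    "mem-be-parity", "mem-data-ecc", "reinit-threshold", "rsvd-encoding",
    "poison-received", "receiver-overflow", "internal",
    "cxl-ide-tx", "cxl-ide-rx",
]


def build_uncor_inject_cmd(path, errors):
    """Build the QMP command dict (hypothetical helper, not a QEMU API).

    path:   QOM path of the CXL device, e.g. /machine/peripheral/cxl-pmem0
    errors: iterable of (type_name, header_values) pairs
    """
    payload = []
    for err_type, header in errors:
        # Reject names that are not in the enum before sending anything.
        if err_type not in UNCOR_ERROR_TYPES:
            raise ValueError(f"unknown uncorrectable error type: {err_type}")
        payload.append({"type": err_type, "header": list(header)})
    return {
        "execute": "cxl-inject-uncorrectable-errors",
        "arguments": {"path": path, "errors": payload},
    }


# Two errors injected in one operation, as in the commit message example.
cmd = build_uncor_inject_cmd(
    "/machine/peripheral/cxl-pmem0",
    [("cache-address-parity", [3, 4]), ("internal", [1, 2, 4])],
)
print(json.dumps(cmd, indent=2))
```

The resulting JSON can be pasted into a QMP session after the
"qmp_capabilities" handshake shown in the commit message.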