This series introduces complete support for RTAS-based hardware error
injection on pseries platforms. The implementation replaces the legacy
MMIO-based approach with a full PAPR-compliant workflow built around the
RTAS services:
        ibm,open-errinjct
        ibm,errinjct
        ibm,close-errinjct

The new pseries_eeh_err_inject() interface enables controlled injection
of synthetic PCI, memory, and cache/TLB faults for platform validation,
EEH testing, firmware diagnostics, and tooling (e.g., bpftrace-based
tracing).
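
For illustration only, the open/errinjct/close bracket described above can
be sketched in plain userspace C. Every name below (errinjct_ops,
errinjct_session, the fake_* backend) is hypothetical and stands in for the
real RTAS calls; it is not the code in this series. The sketch assumes a
zero return means success and shows one reasonable policy: close the
session even when the injection itself fails.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three RTAS services; not kernel APIs. */
struct errinjct_ops {
	int (*open)(uint32_t *session_token);		/* ibm,open-errinjct */
	int (*inject)(uint32_t session_token,
		      uint32_t err_type, void *buf);	/* ibm,errinjct */
	int (*close)(uint32_t session_token);		/* ibm,close-errinjct */
};

/*
 * Run one injection inside an open/close bracket. The session is closed
 * even when the injection fails; the first error seen is returned.
 */
static int errinjct_session(const struct errinjct_ops *ops,
			    uint32_t err_type, void *buf)
{
	uint32_t token = 0;
	int rc, close_rc;

	rc = ops->open(&token);
	if (rc)
		return rc;	/* nothing to close if open failed */

	rc = ops->inject(token, err_type, buf);
	close_rc = ops->close(token);

	return rc ? rc : close_rc;
}

/* Fake backend (assumption, for demonstration): records the call order. */
static char call_log[8];
static size_t call_len;
static int fake_inject_rc;

static void log_call(char c)
{
	if (call_len < sizeof(call_log))
		call_log[call_len++] = c;
}

static int fake_open(uint32_t *token)
{
	log_call('o');
	*token = 42;	/* arbitrary session token */
	return 0;
}

static int fake_inject(uint32_t token, uint32_t err_type, void *buf)
{
	(void)token; (void)err_type; (void)buf;
	log_call('i');
	return fake_inject_rc;
}

static int fake_close(uint32_t token)
{
	(void)token;
	log_call('c');
	return 0;
}

static const struct errinjct_ops fake_ops = {
	.open = fake_open,
	.inject = fake_inject,
	.close = fake_close,
};
```

Calling errinjct_session(&fake_ops, 0x07, NULL) with fake_inject_rc set to
a non-zero value still records the close step, which is the property the
bracket is meant to guarantee.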

Current testing scope:
At this stage, error injection can be triggered only through VFIO: from
passthrough devices assigned to a guest, or from userspace-exposed VFIO
devices via the VFIO_EEH_PE_INJECT_ERR ioctl.

Key Highlights

Dynamic acquisition of all required RTAS tokens and correct
open/errinjct/close session handling as defined by PAPR. A 1KB
naturally-aligned, zero-initialized RTAS working buffer is populated
exactly per PAPR buffer definitions.
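
As a rough userspace sketch of the buffer properties listed above (1KB,
naturally aligned, zero-initialized, big-endian fields), assuming C11
aligned_alloc() is available. The helper names and the word-index
interface are hypothetical; the actual per-error-type layout is defined by
PAPR and is not reproduced here. In the kernel the byte swap would be done
with cpu_to_be32(); the sketch writes the bytes explicitly so it behaves
the same on any host.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ERRINJCT_BUF_SIZE 1024u	/* 1KB working buffer */

/* Allocate a zeroed 1KB buffer aligned to its own size. Caller frees. */
static void *errinjct_buf_alloc(void)
{
	void *buf = aligned_alloc(ERRINJCT_BUF_SIZE, ERRINJCT_BUF_SIZE);

	if (buf)
		memset(buf, 0, ERRINJCT_BUF_SIZE);
	return buf;
}

/*
 * Store a 32-bit value in big-endian byte order at 32-bit word index idx.
 * Returns 0 on success, -1 if buf is NULL or idx is out of range.
 */
static int errinjct_buf_put_be32(void *buf, size_t idx, uint32_t val)
{
	uint8_t *p;

	if (!buf || idx >= ERRINJCT_BUF_SIZE / 4)
		return -1;

	p = (uint8_t *)buf + idx * 4;
	p[0] = (uint8_t)(val >> 24);
	p[1] = (uint8_t)(val >> 16);
	p[2] = (uint8_t)(val >> 8);
	p[3] = (uint8_t)val;
	return 0;
}
```

Aligning the buffer to its own size is one simple way to make sure it
never straddles a 1KB boundary.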

Support for a wide range of error types:

0x03 - recovered-special-event
0x04 - corrupted-page
0x07 - ioa-bus-error (32-bit)
0x0F - ioa-bus-error-64 (64-bit)
0x09 - corrupted-dcache-start
0x0A - corrupted-dcache-end
0x0B - corrupted-icache-start
0x0C - corrupted-icache-end
0x0D - corrupted-tlb-start
0x0E - corrupted-tlb-end
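
The table above maps directly onto a lookup helper. The following is a
minimal sketch with hypothetical names (errinjct_type_name,
errinjct_type_supported); only the numeric codes and strings are taken
from the list, and the series may structure its validation differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the name for an error type code, or NULL if not in the list. */
static const char *errinjct_type_name(uint32_t type)
{
	switch (type) {
	case 0x03: return "recovered-special-event";
	case 0x04: return "corrupted-page";
	case 0x07: return "ioa-bus-error";
	case 0x09: return "corrupted-dcache-start";
	case 0x0A: return "corrupted-dcache-end";
	case 0x0B: return "corrupted-icache-start";
	case 0x0C: return "corrupted-icache-end";
	case 0x0D: return "corrupted-tlb-start";
	case 0x0E: return "corrupted-tlb-end";
	case 0x0F: return "ioa-bus-error-64";
	default:   return NULL;
	}
}

/* Non-zero when the code is one of the types listed above. */
static int errinjct_type_supported(uint32_t type)
{
	return errinjct_type_name(type) != NULL;
}
```

Rejecting unknown codes before the RTAS session is opened keeps invalid
requests from ever reaching firmware.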

All RTAS parameters are converted to big-endian with cpu_to_be32().
Status handling is robust, with printk-based diagnostics and thorough
validation of invalid or unsupported conditions.

Error-specific buffer population logic is factored into helpers for
clarity and maintainability.

Fully tested on PowerVM with firmware that supports RTAS error
injection, along with the companion QEMU support posted here:
https://lore.kernel.org/qemu-devel/[email protected]/

Signed-off-by: Narayana Murty N <[email protected]>
---
Change Log:
v1 -> v2:
 * Addressed all review comments from Sourabh Jain
   - Removed unnecessary empty line in rtas_call()
   - Enhanced comment to explain PAPR specification requirements
   - Corrected misleading comment about output handling
   - Improved else block comment for better code clarity
 * Fixed kernel test robot warnings
   - Fixed kernel-doc warning for __maybe_unused parameter
   - Confirmed sparse warnings are false positives (correct endianness handling)
 * Added PowerNV platform abstraction layer (new Patch 5)
   - Maps EEH error types to OPAL-specific types
   - Simplifies type handling by direct variable update
 * Improved code comments and documentation throughout
 * Added Reported-by tags for kernel test robot findings
 * Split into logical 5-patch series for better review

RFC -> v1: https://lore.kernel.org/all/[email protected]/
 * Initial 4-patch series
 * Fixed PAPR ibm,open-errinjct output format (token,status order)
 * Added pr_fmt handling for EEH subsystem compatibility
 * Implemented comprehensive validation helpers

RFC: https://lore.kernel.org/all/[email protected]/
 * Initial RFC implementation

Narayana Murty N (5):
  powerpc/rtas: Handle special return format for
    RTAS_FN_IBM_OPEN_ERRINJCT
  powerpc/pseries: Add RTAS error injection buffer infrastructure
  powerpc/pseries: Add RTAS error injection validation helpers
  powerpc/pseries: Implement RTAS error injection via
    pseries_eeh_err_inject
  powerpc/powernv: Map EEH error types to OPAL error injection types

 arch/powerpc/include/asm/rtas.h              |  21 +
 arch/powerpc/include/uapi/asm/eeh.h          |  18 +
 arch/powerpc/kernel/rtas.c                   |  59 ++-
 arch/powerpc/platforms/powernv/eeh-powernv.c |  11 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c | 423 ++++++++++++++++++-
 5 files changed, 505 insertions(+), 27 deletions(-)

-- 
2.54.0

