Hi,

Please find below the patch set that performs machine check handling inside
the Linux host. The design handles re-entrancy, so that the machine check
information is not clobbered when a nested machine check interrupt arrives.
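
To illustrate the re-entrancy requirement, here is a minimal user-space
sketch. It is not code from this series: the struct, the function names and
MAX_MC_NEST are hypothetical, and a single global stands in for per-cpu data.
The idea is that each nesting level claims its own slot, so a nested machine
check never overwrites the record of the outer one.

```c
#include <stddef.h>

#define MAX_MC_NEST 4	/* hypothetical maximum nesting depth */

/* Hypothetical record of one machine check; the fields are illustrative. */
struct mce_event_sketch {
	int in_use;
	unsigned long srr0;
	unsigned long srr1;
};

/* In a kernel these would be per-cpu; one instance is enough for a sketch. */
static struct mce_event_sketch events[MAX_MC_NEST];
static int nest_count;

/*
 * Claim the slot for the current nesting level. The counter is bumped even
 * when we are too deep, so that every get is balanced by exactly one put.
 */
static struct mce_event_sketch *mce_slot_get(void)
{
	int idx = nest_count++;

	if (idx >= MAX_MC_NEST)
		return NULL;	/* too deep: nothing to record into */
	events[idx].in_use = 1;
	return &events[idx];
}

/* Release the innermost slot; an unbalanced put must not underflow. */
static void mce_slot_put(void)
{
	if (nest_count <= 0)
		return;
	nest_count--;
	if (nest_count < MAX_MC_NEST)
		events[nest_count].in_use = 0;
}
```

A handler would call mce_slot_get() on entry, fill in the returned record,
and call mce_slot_put() once the event has been consumed. The guard in
mce_slot_put() shows one way to keep such a counter from going negative on
an unbalanced exit path.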

Patch 2 introduces a separate emergency stack in the paca structure, used
exclusively for machine check exception handling. Patch 3 implements the
logic to save the raw MCE info onto the emergency stack and prepares to take
another exception. Patches 4 and 5 add CPU-side hooks for the early machine
check handler and TLB flush. Patches 6 and 7 detect SLB/TLB errors and flush
the SLB/TLB in real mode. Patch 8 implements the logic to decode and save
high level MCE information to a per-cpu buffer without clobbering. Patch 10
adds basic error handling to the high level C code with MMU on.

I have tested the SLB multihit scenario on powernv.

Please review and let me know your comments.

Changes in v2:
- Moved early machine check handling code under the CPU_FTR_HVMODE section.
  This makes sure that the early machine check handler is executed only in
  the hypervisor kernel.
- Added a dedicated emergency stack for machine check so that we don't end up
  disturbing others who use the same emergency stack.
- Fixed the early machine check handler, which used to assume that r1 always
  contains a valid stack pointer.
- Fixed an issue where the per-cpu mce_nest_count variable underflows when
  KVM fails to handle the MC error and exits the guest.
- Fixed the code to restore r13 before exiting the early handler.

Thanks,
-Mahesh.
---

Mahesh Salgaonkar (10):
      powerpc/book3s: Split the common exception prolog logic into two section.
      powerpc/book3s: Introduce exclusive emergency stack for machine check exception.
      powerpc/book3s: handle machine check in Linux host.
      powerpc/book3s: Introduce a early machine check hook in cpu_spec.
      powerpc/book3s: Add flush_tlb operation in cpu_spec.
      powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power7.
      powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power8.
      powerpc/book3s: Decode and save machine check event.
      powerpc/powernv: Remove machine check handling in OPAL.
      powerpc/powernv: Machine check exception handling.


 arch/powerpc/include/asm/bitops.h        |    5 +
 arch/powerpc/include/asm/cputable.h      |   12 +
 arch/powerpc/include/asm/exception-64s.h |   67 ++++---
 arch/powerpc/include/asm/mce.h           |  195 ++++++++++++++++++++
 arch/powerpc/include/asm/paca.h          |    9 +
 arch/powerpc/kernel/Makefile             |    1
 arch/powerpc/kernel/asm-offsets.c        |    4
 arch/powerpc/kernel/cpu_setup_power.S    |   38 +++-
 arch/powerpc/kernel/cputable.c           |   16 ++
 arch/powerpc/kernel/exceptions-64s.S     |  108 +++++++++++
 arch/powerpc/kernel/mce.c                |  191 ++++++++++++++++++++
 arch/powerpc/kernel/mce_power.c          |  287 ++++++++++++++++++++++++++++++
 arch/powerpc/kernel/setup_64.c           |    8 +
 arch/powerpc/kernel/traps.c              |   15 ++
 arch/powerpc/kvm/book3s_hv_ras.c         |   50 +++--
 arch/powerpc/platforms/powernv/opal.c    |   84 ++++++---
 arch/powerpc/xmon/xmon.c                 |    2
 17 files changed, 998 insertions(+), 94 deletions(-)
 create mode 100644 arch/powerpc/include/asm/mce.h
 create mode 100644 arch/powerpc/kernel/mce.c
 create mode 100644 arch/powerpc/kernel/mce_power.c

--
-Mahesh