Nicholas Piggin <npig...@gmail.com> writes: > Excerpts from Michael Ellerman's message of February 4, 2021 8:15 pm: >> Nicholas Piggin <npig...@gmail.com> writes: >>> This moves the common NMI entry and exit code into the interrupt handler >>> wrappers. >>> >>> This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and >>> also MCE interrupts on 64e, by adding missing parts of the NMI entry to >>> them. >>> >>> Signed-off-by: Nicholas Piggin <npig...@gmail.com> >>> --- >>> arch/powerpc/include/asm/interrupt.h | 28 ++++++++++++++++++++++ >>> arch/powerpc/kernel/mce.c | 11 --------- >>> arch/powerpc/kernel/traps.c | 35 +++++----------------------- >>> arch/powerpc/kernel/watchdog.c | 10 ++++---- >>> 4 files changed, 38 insertions(+), 46 deletions(-) >> >> This is unhappy when injecting SLB multi-hits: >> >> root@p86-2:~# echo PPC_SLB_MULTIHIT > >> /sys/kernel/debug/provoke-crash/DIRECT >> [ 312.496026][ T1344] kernel BUG at >> arch/powerpc/include/asm/interrupt.h:152! >> [ 312.496037][ T1344] Oops: Exception in kernel mode, sig: 5 [#1] >> [ 312.496045][ T1344] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA >> pSeries > > pseries hash. Blast!
The worst kind. >> 147 static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, >> struct interrupt_nmi_state *state) >> 148 { >> 149 if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) || >> 150 !firmware_has_feature(FW_FEATURE_LPAR) || >> 151 radix_enabled() || (mfmsr() & MSR_DR)) >> 152 nmi_exit(); >> >> >> So presumably it's: >> >> #define __nmi_exit() \ >> do { \ >> BUG_ON(!in_nmi()); \ > > Yes that would be it, pseries machine check enables MMU half way through > so only one side of this triggers. > > The MSR_DR check is supposed to catch the other NMIs that run with MMU > on (perf, watchdog, etc). Suppose it could test TRAP(regs) explicitly > although I wonder if we should also do this to keep things balanced Yeah I think I like that. I'll give it a test. cheers > diff --git a/arch/powerpc/platforms/pseries/ras.c > b/arch/powerpc/platforms/pseries/ras.c > index 149cec2212e6..f57ca0c570be 100644 > --- a/arch/powerpc/platforms/pseries/ras.c > +++ b/arch/powerpc/platforms/pseries/ras.c > @@ -719,6 +719,7 @@ static int mce_handle_err_virtmode(struct pt_regs *regs, > > static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log > *errp) > { > + unsigned long msr; > struct pseries_errorlog *pseries_log; > struct pseries_mc_errorlog *mce_log = NULL; > int disposition = rtas_error_disposition(errp); > @@ -747,9 +748,12 @@ static int mce_handle_error(struct pt_regs *regs, struct > rtas_error_log *errp) > * SLB multihit is done by now. > */ > out: > - mtmsr(mfmsr() | MSR_IR | MSR_DR); > + msr = mfmsr(); > + mtmsr(msr | MSR_IR | MSR_DR); > disposition = mce_handle_err_virtmode(regs, errp, mce_log, > disposition); > + mtmsr(msr); > + > return disposition; > } >