On Fri, Sep 22, 2017 at 1:32 PM, Michael Neuling <mi...@neuling.org> wrote:
> On POWER9 DD2.1 and below, it's possible for a paste instruction to
> cause a Machine Check Exception (MCE) where only DSISR bit 33 is
> set. This will result in the MCE handler seeing an unknown event,
> which triggers linux to crash.
>
> We change this by detecting unknown events caused by load/stores in
> the MCE handler and marking them as handled so that we no longer
> crash.
>
> An MCE that occurs like this is spurious, so we don't need to do
> anything in terms of servicing it. If there is something that needs to
> be serviced, the CPU will raise the MCE again with the correct DSISR
> so that it can be serviced properly.
>
> Signed-off-by: Michael Neuling <mi...@neuling.org>
> Reviewed-by: Nicholas Piggin <npig...@gmail.com
> --
> v3: Simplification and SRR1 check suggestions from Nick
> v2: update commit message based on Balbir's comments
> ---
>  arch/powerpc/kernel/mce_power.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index b76ca198e0..e423cf0e43 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -624,5 +624,15 @@ long __machine_check_early_realmode_p8(struct pt_regs *regs)
>
>  long __machine_check_early_realmode_p9(struct pt_regs *regs)
>  {
> +	/*
> +	 * On POWER9 DD2.1 and below, it's possible to get machine
> +	 * check caused by a paste instruction where only DSISR bit 33
> +	 * is set. This will result in the MCE handler seeing an
> +	 * unknown event and us crashing. Change this to mark as
> +	 * handled.
> +	 */
> +	if (SRR1_MC_LOADSTORE(regs->msr) && regs->dsisr == 0x40000000)
> +		return 1;
> +

Acked-by: Balbir Singh <bsinghar...@gmail.com>
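
For anyone mapping "DSISR bit 33" onto the 0x40000000 constant in the hunk: IBM bit numbering counts from the most significant bit of the 64-bit register, so bit 33 is 1 << (63 - 33) = 1 << 30. Below is a minimal user-space sketch of the same check, not the kernel code itself. struct fake_regs and is_spurious_paste_mce() are hypothetical names used only for illustration, and the SRR1_MC_LOADSTORE definition (a test of SRR1 bit 42) is an assumption about what the kernel macro does.

```c
#include <stdint.h>

/* IBM (big-endian) bit numbering on a 64-bit register: bit 0 is the MSB. */
#define PPC_BIT(be)	(UINT64_C(1) << (63 - (be)))

/* Assumption: stands in for the kernel macro, taken here to test SRR1 bit 42. */
#define SRR1_MC_LOADSTORE(srr1)	((srr1) & PPC_BIT(42))

/* Hypothetical stand-in for the two struct pt_regs fields the patch reads. */
struct fake_regs {
	uint64_t msr;	/* holds SRR1 at machine check time */
	uint64_t dsisr;
};

/*
 * Returns 1 when the MCE came from a load/store and DSISR has only
 * bit 33 set, i.e. the case the patch marks as handled.
 */
static int is_spurious_paste_mce(const struct fake_regs *regs)
{
	return SRR1_MC_LOADSTORE(regs->msr) && regs->dsisr == PPC_BIT(33);
}
```

The comparison uses == rather than a mask test, as in the patch, so a DSISR with any additional bit set still falls through to the normal MCE decoding.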