On 08/07/2013 03:08 PM, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
> 
> Move machine check entry point into Linux. So far we were dependent on
> firmware to decode MCE error details and hand over the high-level info to
> the OS.
> 
> This patch introduces an early machine check routine that saves the MCE
> information (srr1, srr0, dar and dsisr) to the emergency stack. We allocate
> a stack frame on the emergency stack and set r1 accordingly. This allows us
> to be prepared to take another exception without losing context. Note
> that if we get another machine check while the ME bit is off, we risk a
> checkstop. Hence we restrict ourselves to saving only the MCE information
> and turning the ME bit on.
> 
> This is the code flow:
> 
>               Machine Check Interrupt
>                       |
>                       V
>                  0x200 vector                           ME=0, IR=0, DR=0
>                       |
>                       V
>       +-----------------------------------------------+
>       |machine_check_pSeries_early:                   | ME=0, IR=0, DR=0
>       |       Alloc frame on emergency stack          |
>       |       Save srr1, srr0, dar and dsisr on stack |
>       +-----------------------------------------------+
>                       |
>               (ME=1, IR=0, DR=0, RFID)
>                       |
>                       V
>               machine_check_handle_early                ME=1, IR=0, DR=0
>                       |
>                       V
>       +-----------------------------------------------+
>       |       machine_check_early (r3=pt_regs)        | ME=1, IR=0, DR=0
>       |       Things to do: (in next patches)         |
>       |               Flush SLB for SLB errors        |
>       |               Flush TLB for TLB errors        |
>       |               Decode and save MCE info        |
>       +-----------------------------------------------+
>                       |
>       (Fall through existing exception handler routine.)
>                       |
>                       V
>               machine_check_pSeries                     ME=1, IR=0, DR=0
>                       |
>               (ME=1, IR=1, DR=1, RFID)
>                       |
>                       V
>               machine_check_common                      ME=1, IR=1, DR=1
>                       .
>                       .
>                       .
> 
> 
> Signed-off-by: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/exception-64s.h |   43 ++++++++++++++++++++++++++
>  arch/powerpc/kernel/exceptions-64s.S     |   50 +++++++++++++++++++++++++++++-
>  arch/powerpc/kernel/traps.c              |   12 +++++++
>  3 files changed, 104 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 2386d40..c5d2cbc 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -174,6 +174,49 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>  #define EXCEPTION_PROLOG_1(area, extra, vec)                         \
>       __EXCEPTION_PROLOG_1(area, extra, vec)
> 
> +/*
> + * Register contents:
> + * R12               = interrupt vector
> + * R13               = PACA
> + * R9                = CR
> + * R11 & R12 are saved on PACA_EXMC
> + *
> + * Switch to emergency stack and handle re-entrancy (though we currently
> + * don't test for overflow). Save MCE registers srr1, srr0, dar and
> + * dsisr and then turn the ME bit on.
> + */
> +#define __EARLY_MACHINE_CHECK_HANDLER(area, label)                   \
> +     /* Check if we are already using emergency stack. */            \
> +     ld      r10,PACAEMERGSP(r13);                                   \
> +     subi    r10,r10,THREAD_SIZE;                                    \
> +     rldicr  r10,r10,0,(63 - THREAD_SHIFT);                          \
> +     rldicr  r11,r1,0,(63 - THREAD_SHIFT);                           \
> +     cmpd    r10,r11;        /* Are we using emergency stack? */     \
> +     mr      r11,r1;                 /* Save current stack pointer */\
> +     beq     0f;                                                     \
> +     ld      r1,PACAEMERGSP(r13);    /* Use emergency stack */       \
> +0:   subi    r1,r1,INT_FRAME_SIZE;   /* alloc stack frame */         \
> +     std     r11,GPR1(r1);                                           \
> +     std     r11,0(r1);              /* make stack chain pointer */  \
> +     mfspr   r11,SPRN_SRR0;          /* Save SRR0 */                 \
> +     std     r11,_NIP(r1);                                           \
> +     mfspr   r11,SPRN_SRR1;          /* Save SRR1 */                 \
> +     std     r11,_MSR(r1);                                           \
> +     mfspr   r11,SPRN_DAR;           /* Save DAR */                  \
> +     std     r11,_DAR(r1);                                           \
> +     mfspr   r11,SPRN_DSISR;         /* Save DSISR */                \
> +     std     r11,_DSISR(r1);                                         \
> +     mfmsr   r11;                    /* get MSR value */             \
> +     ori     r11,r11,MSR_ME;         /* turn on ME bit */            \

You need to mention here that we are vulnerable to a core checkstop if
another machine check exception arrives in the window between the
occurrence of the interrupt and the point where we set the ME bit on.
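
For reference, the re-entrancy check at the top of the macro can be
modelled in C roughly as below. This is only an illustrative sketch, not
kernel code: the THREAD_SHIFT value and the helper names stack_base() and
on_emergency_stack() are assumptions made for the example, and
emergency_sp is assumed to hold the top of the emergency stack, which is
how PACAEMERGSP is used in the quoted macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only; the kernel takes it from its config. */
#define THREAD_SHIFT 14
#define THREAD_SIZE  (UINT64_C(1) << THREAD_SHIFT)

/*
 * Clear the low THREAD_SHIFT bits of an address, which is what
 * "rldicr rX,rY,0,(63 - THREAD_SHIFT)" does in the macro.
 */
static uint64_t stack_base(uint64_t addr)
{
	return addr & ~(THREAD_SIZE - 1);
}

/*
 * Model of the check: emergency_sp stands in for PACAEMERGSP(r13) and
 * r1 for the current stack pointer. Both are rounded down to a
 * THREAD_SIZE boundary and compared, as the cmpd r10,r11 does.
 */
static bool on_emergency_stack(uint64_t emergency_sp, uint64_t r1)
{
	return stack_base(emergency_sp - THREAD_SIZE) == stack_base(r1);
}
```

With a hypothetical emergency stack occupying 0x10000..0x14000,
on_emergency_stack(0x14000, 0x13f00) is true (nested entry, keep using
r1), while a stack pointer in any other THREAD_SIZE-aligned region gives
false and the macro loads r1 from PACAEMERGSP.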
