On Wed,  2 May 2018 23:07:26 +1000
Michael Ellerman <m...@ellerman.id.au> wrote:

> A CPU that gets stuck with interrupts hard disabled can be difficult to
> debug, as on some platforms we have no way to interrupt the CPU to
> find out what it's doing.
> 
> A stop-gap is to have the CPU save its stack pointer (r1) in its paca
> when it hard disables interrupts. That way if we can't interrupt it,
> we can at least trace the stack based on where it last disabled
> interrupts.
> 
> In some cases that will be total junk, but the stack trace code should
> handle that. In the simple case of a CPU that disables interrupts and
> then gets stuck in a loop, the stack trace should be informative.
> 
> We could clear the saved stack pointer when we enable interrupts, but
> that loses information which could be useful if we have nothing else
> to go on.
> 
> Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/hw_irq.h    | 6 +++++-
>  arch/powerpc/include/asm/paca.h      | 2 +-
>  arch/powerpc/kernel/exceptions-64s.S | 1 +
>  arch/powerpc/xmon/xmon.c             | 2 ++
>  4 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
> index 855e17d158b1..35cb37be61fe 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -237,8 +237,12 @@ static inline bool arch_irqs_disabled(void)
>       __hard_irq_disable();                                           \
>       flags = irq_soft_mask_set_return(IRQS_ALL_DISABLED);            \
>       local_paca->irq_happened |= PACA_IRQ_HARD_DIS;                  \
> -     if (!arch_irqs_disabled_flags(flags))                           \
> +     if (!arch_irqs_disabled_flags(flags)) {                         \
> +             asm ("stdx %%r1, 0, %1 ;"                               \
> +                  : "=m" (local_paca->saved_r1)                      \
> +                  : "b" (&local_paca->saved_r1));                    \
>               trace_hardirqs_off();                                   \
> +     }       

This is pretty neat. It would be good to have something that's not as
destructive as the NMI IPI.
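
For anyone following along, the idea in the hunk above can be sketched in
user space roughly as follows. This is only an illustration under loud
assumptions: fake_paca, fake_hard_irq_disable(), fake_hard_irq_enable() and
demo() are made-up names, not kernel APIs, and the GCC/Clang builtin
__builtin_frame_address(0) stands in for reading r1 with stdx.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU paca; only the field the patch uses. */
struct fake_paca {
	uintptr_t saved_r1;        /* stack location at the last hard disable */
	int irqs_hard_disabled;    /* pretend interrupt state, for the demo only */
};

static struct fake_paca paca;

/* Record roughly where we are on the stack, then "disable" interrupts. */
static inline void fake_hard_irq_disable(void)
{
	/* Assumption: the frame address approximates what r1 would hold. */
	paca.saved_r1 = (uintptr_t)__builtin_frame_address(0);
	paca.irqs_hard_disabled = 1;
}

/* Re-enable without clearing saved_r1, so the last value stays available. */
static inline void fake_hard_irq_enable(void)
{
	paca.irqs_hard_disabled = 0;
}

/* Returns 1 if the saved value survives the enable, as the patch intends. */
int demo(void)
{
	fake_hard_irq_disable();
	uintptr_t at_disable = paca.saved_r1;
	assert(at_disable != 0);
	fake_hard_irq_enable();
	return paca.saved_r1 == at_disable && !paca.irqs_hard_disabled;
}
```

A debugger on another CPU would then start its stack walk from saved_r1,
treating the value as a hint that may be stale.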

Thanks,
Nick
