On Wed, Sep 05, 2018 at 09:33:34PM -0400, Steven Rostedt wrote:
>   do_idle {
> 
>     [interrupts enabled]
> 
>     <interrupt> [interrupts disabled]
>       TRACE_IRQS_OFF [lockdep says irqs off]
>       [...]
>       TRACE_IRQS_IRET
>           test if pt_regs say return to interrupts enabled [yes]
>           TRACE_IRQS_ON [lockdep says irqs are on]
> 
>           <nmi>
>               nmi_enter() {
>                   printk_nmi_enter() [traced by ftrace]
>                   [ hit ftrace breakpoint ]
>                   <breakpoint exception>
>                       TRACE_IRQS_OFF [lockdep says irqs off]
>                       [...]
>                       TRACE_IRQS_IRET [return from breakpoint]
>                          test if pt_regs say interrupts enabled [no]
>                          [iret back to interrupt]
>          [iret back to code]
> 
>     tick_nohz_idle_enter() {
> 
>       lockdep_assert_irqs_enabled() [lockdep say no!]

Isn't the problem that we muck with the IRQ state from NMI context? We
shouldn't be doing that.

The thing is, since we trace the IRQ state from within IRQ-disable,
since that's the only IRQ-safe option, it is very much not NMI-safe.

Your patch might avoid the symptom, but I don't think it cures the
fundamental problem.

Reply via email to