The current touch_nmi_watchdog() function in kernel/watchdog.c does not catch every case where a processor is spinning in the NMI handler inside KGDB, KDB, or MDB. In particular, it misses the case where a processor is being held by a debugger inside an int1 handler.
The hrtimer_interrupts_saved count can still end up matching the current hrtimer_interrupts value in some cases, so the hard lockup detector flags processors that are held inside a debugger and triggers a panic. The patch below corrects this problem. I did not add this to touch_nmi_watchdog() directly because of possible effects on timing, since that function is widely used by drivers and modules. I have tested this patch, and it stops the errant hard lockup events for kernel debuggers when processors are spinning inside the debugger.

Signed-off-by: Jeff Merkey <linux....@gmail.com>
---
 kernel/watchdog.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 18f34cf..b682aab 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -283,6 +283,13 @@ static bool is_hardlockup(void)
 	__this_cpu_write(hrtimer_interrupts_saved, hrint);
 	return false;
 }
+
+void touch_hardlockup_watchdog(void)
+{
+	__this_cpu_write(hrtimer_interrupts_saved, 0);
+}
+EXPORT_SYMBOL_GPL(touch_hardlockup_watchdog);
+
 #endif
 
 static int is_softlockup(unsigned long touch_ts)
-- 
1.8.3.1
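
For readers who want to see why clearing the saved count suppresses the false positive, here is a minimal userspace sketch of the comparison that is_hardlockup() performs. It is not kernel code: plain static variables stand in for the per-CPU counters, and the model_* function names are hypothetical names chosen for this illustration only.

```c
#include <stdbool.h>

/* Userspace stand-ins for the per-CPU counters in kernel/watchdog.c. */
static unsigned long hrtimer_interrupts;       /* bumped by the watchdog hrtimer */
static unsigned long hrtimer_interrupts_saved; /* value seen at the last NMI check */

/* Hypothetical model of the hrtimer callback: shows the CPU still takes timer interrupts. */
static void model_hrtimer_tick(void)
{
	hrtimer_interrupts++;
}

/* Same comparison as is_hardlockup(): no timer progress since the last check means lockup. */
static bool model_is_hardlockup(void)
{
	unsigned long hrint = hrtimer_interrupts;

	if (hrtimer_interrupts_saved == hrint)
		return true;

	hrtimer_interrupts_saved = hrint;
	return false;
}

/* Hypothetical model of the proposed touch_hardlockup_watchdog(): clear the saved count. */
static void model_touch_hardlockup_watchdog(void)
{
	hrtimer_interrupts_saved = 0;
}
```

In this model, one tick followed by two checks with no tick in between reports a lockup on the second check, which is the situation of a processor parked in a debugger. Calling model_touch_hardlockup_watchdog() before the next check makes the comparison fail again, so that check reports no lockup. The reset only helps while the interrupt count is nonzero, since a saved value of 0 would otherwise still match.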