> > You would only need a single one per system however, not one per CPU.
> > RCU already tracks all the CPUs, all we need is a single NMI watchdog
> > that makes sure RCU itself does not get stuck.
> >
> > So we just have to find a single watchdog somewhere that can trigger
> > NMI.
>
> But then you have to IPI broadcast the NMI, which is less than ideal.

Only when the watchdog times out to print the backtraces.

> RCU doesn't have that problem because the quiescent state is a global
> thing. CPU progress, which is what the NMI watchdog tests, is very much
> per logical CPU though.

RCU already has a CPU stall detector. It should work (and usually triggers
before the NMI watchdog in my experience unless the whole system is dead).

-Andi
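
For illustration only, here is a rough userspace sketch of the scheme
discussed above: one system-wide watchdog watches a single global progress
counter, and the expensive broadcast to every CPU happens only once that
counter has stopped moving for a timeout. This is not kernel code. All names
(`cpu_report_qs`, `watchdog_tick`, `broadcast_backtrace`, `NR_CPUS`,
`STALL_TIMEOUT`) are made up for the example, and the "broadcast" is just a
loop with a printf standing in for an NMI IPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS       4   /* hypothetical CPU count for this sketch */
#define STALL_TIMEOUT 3   /* hypothetical timeout, in watchdog ticks */

/* Per-CPU flag: has this CPU reported a quiescent state this period? */
static bool cpu_reported_qs[NR_CPUS];

/* Global progress counter, bumped once every CPU has reported. */
static unsigned long periods_completed;

/* State owned by the single system-wide watchdog. */
static unsigned long wd_last_seen;
static int wd_stalled_ticks;
static int backtrace_broadcasts;  /* how often the costly path ran */

/* Called on behalf of one CPU; completes the period when all have reported. */
static void cpu_report_qs(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_reported_qs[cpu] = true;

    for (int i = 0; i < NR_CPUS; i++)
        if (!cpu_reported_qs[i])
            return;               /* still waiting on some CPU */

    for (int i = 0; i < NR_CPUS; i++)
        cpu_reported_qs[i] = false;
    periods_completed++;
}

/* Stand-in for the NMI IPI broadcast: visit every CPU and dump its state. */
static void broadcast_backtrace(void)
{
    backtrace_broadcasts++;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        printf("cpu %d: %s\n", cpu,
               cpu_reported_qs[cpu] ? "reported" : "STALLED");
}

/* One watchdog tick. Returns true if a stall was detected on this tick. */
static bool watchdog_tick(void)
{
    if (periods_completed != wd_last_seen) {
        /* Global progress was made: cheap path, no per-CPU work at all. */
        wd_last_seen = periods_completed;
        wd_stalled_ticks = 0;
        return false;
    }
    if (++wd_stalled_ticks < STALL_TIMEOUT)
        return false;

    /* Timed out: only now pay for touching every CPU. */
    broadcast_backtrace();
    wd_stalled_ticks = 0;
    return true;
}
```

A caller would invoke `cpu_report_qs()` for each CPU and `watchdog_tick()`
periodically. As long as every CPU keeps reporting, `backtrace_broadcasts`
stays at zero; if one CPU stops reporting, the watchdog fires after
`STALL_TIMEOUT` ticks and the per-CPU dump shows which one is stuck.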