On Fri, 2 Apr 2021 23:09:09 -0400
Waiman Long <long...@redhat.com> wrote:

> The main problem with sched_debug_lock is that under certain 
> circumstances, a lock waiter may wait a long time to acquire the lock 
> (in seconds). We can't insert touch_nmi_watchdog() while the cpu is 
> waiting for the spinlock.

The problem I have with the patch is that it seems to be a hack (as it
doesn't fix the issue in all cases). Since sched_debug_lock is
"special", perhaps we can add wrappers to take it, and instead of doing
the spin_lock_irqsave(), do a trylock loop. Add a lockdep annotation to
tell lockdep that this is not a trylock (so that it can still detect
deadlocks).

Then have the strategically placed touch_nmi_watchdog() also increment
a counter. Then in that trylock loop, if it sees the counter get
incremented, it knows that forward progress is being made by the lock
holder, and it too can call touch_nmi_watchdog().
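
To make the idea concrete, here is a rough userspace model of it, not
kernel code and not a patch. It assumes a pthread mutex in place of
sched_debug_lock, sched_yield() in place of cpu_relax(), and a stub that
only counts calls in place of touch_nmi_watchdog(). All of the
debug_lock_*() names and the progress counter are made up for
illustration. A real version would also need the irqsave handling and the
lockdep annotation mentioned above, which are left out here.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Stand-in for touch_nmi_watchdog(): this model only counts the calls. */
static atomic_ulong watchdog_touches;

static void touch_nmi_watchdog_stub(void)
{
	atomic_fetch_add(&watchdog_touches, 1);
}

/* Hypothetical stand-ins for sched_debug_lock and its progress counter. */
static pthread_mutex_t debug_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_ulong debug_lock_progress;

/*
 * Called by the lock holder at the places where it would touch the
 * watchdog anyway. Bumping the counter tells waiters that the holder
 * is still making forward progress.
 */
static void debug_lock_touch(void)
{
	touch_nmi_watchdog_stub();
	atomic_fetch_add(&debug_lock_progress, 1);
}

/*
 * Take the lock with a trylock loop instead of a blocking acquire.
 * While waiting, touch the watchdog only if the holder's counter has
 * moved since we last looked, so a holder that is really stuck would
 * still be reported.
 */
static void debug_lock_acquire(void)
{
	unsigned long seen = atomic_load(&debug_lock_progress);

	while (pthread_mutex_trylock(&debug_lock) != 0) {
		unsigned long now = atomic_load(&debug_lock_progress);

		if (now != seen) {
			seen = now;
			touch_nmi_watchdog_stub();
		}
		sched_yield();	/* userspace stand-in for cpu_relax() */
	}
}

static void debug_lock_release(void)
{
	int rc = pthread_mutex_unlock(&debug_lock);

	assert(rc == 0);
	(void)rc;
}
```

The point of the counter is that the waiter never touches the watchdog
on its own: it only does so when it sees evidence that the holder is
moving, so a holder that is truly wedged still trips the watchdog.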

-- Steve
