There appears to be a deadlock in the hrtimer code. Specifically,
clock_was_set() sends an IPI through on_each_cpu() with wait=1, from
softirq context.

Waiting for IPIs to complete in irq context can lead to a deadlock,
because the code that was interrupted might be holding a lock that
another CPU is waiting for with spin_lock_irq or similar.

In other words, the current CPU may need to release a resource, before
the IPI can be handled by one of the destination CPUs.
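
To illustrate the scenario above, here is a toy userspace model. It is
not kernel code, and struct model and simulate() are hypothetical names
made up for this sketch. It assumes CPU 0 was interrupted while holding
a lock, and CPU 1 is spinning on that lock with interrupts disabled, so
CPU 1 cannot take the IPI until it gets the lock.

```c
#include <stdbool.h>

/* Toy model (hypothetical, for illustration only): two CPUs, one lock,
 * one IPI sent from CPU 0 to CPU 1. */
struct model {
	bool lock_held_by_cpu0;   /* interrupted code on CPU 0 holds the lock */
	bool cpu1_irqs_off;       /* CPU 1 spins with interrupts disabled */
	bool ipi_pending_on_cpu1; /* IPI was sent but not yet handled */
	bool cpu0_waiting;        /* CPU 0 blocks until the IPI is handled */
};

/* Returns true if the model runs to completion, false if it deadlocks. */
static bool simulate(bool wait)
{
	struct model m = { true, true, true, wait };

	for (int step = 0; step < 8; step++) {
		bool progressed = false;

		/* CPU 1 can only handle the IPI with interrupts enabled. */
		if (m.ipi_pending_on_cpu1 && !m.cpu1_irqs_off) {
			m.ipi_pending_on_cpu1 = false;
			progressed = true;
		}
		/* CPU 0 stops waiting once the IPI has been handled. */
		if (m.cpu0_waiting && !m.ipi_pending_on_cpu1) {
			m.cpu0_waiting = false;
			progressed = true;
		}
		/* Once CPU 0 is not waiting, the interrupted code resumes
		 * and drops the lock. */
		if (!m.cpu0_waiting && m.lock_held_by_cpu0) {
			m.lock_held_by_cpu0 = false;
			progressed = true;
		}
		/* CPU 1 gets the lock, finishes, and re-enables interrupts. */
		if (!m.lock_held_by_cpu0 && m.cpu1_irqs_off) {
			m.cpu1_irqs_off = false;
			progressed = true;
		}

		if (!m.ipi_pending_on_cpu1 && !m.lock_held_by_cpu0)
			return true;	/* everything completed */
		if (!progressed)
			return false;	/* nobody can move: deadlock */
	}
	return false;
}
```

In this model simulate(true) gets stuck on the first step, because
neither CPU can make progress, while simulate(false) completes: CPU 0
returns without waiting, the lock is released, and CPU 1 then handles
the IPI.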

To my untrained eye, it does not look like this patch introduces a
new bug to the timer code, but that is hard to ascertain with the
timer code, so I am posting this as an RFC for the timer gods to hurt
their brains on :)

This bug was introduced by 54cdfdb4 in early 2007 (the original
hrtimer code patch).

Not-yet-signed-off-by: Rik van Riel <r...@redhat.com>
Reported-by: Mateusz Guzik <mgu...@redhat.com>
Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Prarit Bhargava <pra...@redhat.com>
Cc: Frederic Weisbecker <fweis...@gmail.com>
Cc: Clark Williams <willi...@redhat.com>
---
 kernel/hrtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 0909436..19145ec 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -771,7 +771,7 @@ void clock_was_set(void)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
        /* Retrigger the CPU local events everywhere */
-       on_each_cpu(retrigger_next_event, NULL, 1);
+       on_each_cpu(retrigger_next_event, NULL, 0);
 #endif
        timerfd_clock_was_set();
 }