Hi, I'm about six months late here(!), but I noticed this bug in arch/x86_64/kernel/smp.c while preparing another related patch today and then found this thread during Googling ...
On 2/9/07, Heiko Carstens <[EMAIL PROTECTED]> wrote:
On i386/x86_64 smp_call_function_single() takes call_lock with spin_lock_bh(). To me this would imply that it is legal to call smp_call_function_single() from softirq context. It's not since smp_call_function() takes call_lock with just spin_lock(). We can easily deadlock: -> [process context] -> smp_call_function() -> spin_lock(&call_lock) -> IRQ -> do_softirq -> tasklet -> [softirq context] -> smp_call_function_single() -> spin_lock_bh(&call_lock) -> dead
You're absolutely right, and this bug still exists in the latest -git.
So either all spin_lock_bh's should be converted to spin_lock, which would limit smp_call_function()/smp_call_function_single() to process context & irqs enabled. Or the spin_lock's could be converted to spin_lock_bh which would make it possible to call these two functions even if in softirq context. AFAICS this should be safe.
Actually, I agree with David and Andi here: On 2/9/07, David Miller <[EMAIL PROTECTED]> wrote:
I think it's logically simpler if we disallow smp_call_function*() from any kind of asynchronous context. But I'm sure your driver has a true need for this for some reason.
and On 2/9/07, Andi Kleen <[EMAIL PROTECTED]> wrote:
I'm not so sure. Perhaps drop _bh in both and stick a WARN_ON_ONCE in to catch the cases?
Replacing the _bh variants and making smp_call_function{_single} illegal from all contexts but process is fine for x86_64, as we don't really have any driver that needs to use this from softirq context in the x86_64 tree. This means it becomes dissimilar to s390, but similar to powerpc, mips, alpha, sparc64 semantics. I'll prepare and submit a patch for the same, shortly. As for: On 2/9/07, Heiko Carstens <[EMAIL PROTECTED]> wrote:
Another thing that comes into my mind is smp_call_function together with cpu hotplug. Who is responsible that preemption and with that cpu hotplug is disabled? Is it the caller or smp_call_function itself? If it's smp_call_function then s390 would be broken, since then we would have int cpus = num_online_cpus()-1; in preemptible context... I agree: what a mess :)
and On 2/9/07, Jan Glauber <[EMAIL PROTECTED]> wrote:
If preemption must be disabled before smp_call_function() we should have the same semantics for all smp_call_function_* variants.
I don't see any CPU hotplug / preemption disabling issues here. Note that both smp_call_function() and smp_call_function_single() on x86_64 acquire the call_lock spinlock before using cpu_online_map via num_online_cpus(). And spin_lock() does preempt_disable() on both SMP and !SMP, so we're safe. [ But we're not explicitly disabling preemption and depending on spin_lock() instead, so a comment would be in order? ] Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/