On Tue, 16 Oct 2007, Gautham R Shenoy wrote: > > Patch 1/4: Implements the core refcount + waitqueue model. > Patch 2/4: Replaces all the lock_cpu_hotplug/unlock_cpu_hotplug instances > with get_online_cpus()/put_online_cpus() > Patch 3/4: Replaces the subsystem mutexes (we do have three of them now, > in sched.c, slab.c and workqueue.c) with get_online_cpus, > put_online_cpus. > Patch 4/4: Eliminates the CPU_DEAD and CPU_UP_CANCELLED event handling > from workqueue.c > > The patch series has survived an overnight test with kernbench on i386. > and has been tested with Paul Mckenney's latest preemptible rcu code. > > Awaiting thy feedback!
Well, afaik, the patch series is fairly clean, and I'm obviously perfectly happy with the approach, so I have no objections. But it looks buggy. This: +static void cpu_hotplug_begin(void) +{ + mutex_lock(&cpu_hotplug.lock); + cpu_hotplug.active_writer = current; + while (cpu_hotplug.refcount) { + mutex_unlock(&cpu_hotplug.lock); + wait_for_completion(&cpu_hotplug.readers_done); + mutex_lock(&cpu_hotplug.lock); + } + +} drops the cpu_hotplug.lock, which - as far as I can see - means that another process can come in and do the same, and mess up the "active_writer" thing. The oerson that actually *gets* the lock may not be the same one that has "active_writer" set to itself. No? Am I missing something. So I think this needs (a) more people looking at it (I think I found a bug, who knows if there are more subtle ones lurking) and (b) lots of testing. Linus - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/