Code reading uncovered a potential deadlock on the global cpuset semaphore, cpuset_sem.
==> This patch is only useful in the 2.6.13-stable series. (It's harmless, and useless, in the pre 2.6.14 fork) The pre-2.6.14 fork has already diverged, with an additional patch that further aggrevated this problem, and a more thorough overhaul of the cpuset locking, to fix the problems. All code paths in kernel/cpuset.c (2.6.13 or earlier) that first grab cpuset_sem and then allocate memory _must_ call the routine 'refresh_mems()', after getting cpuset_sem, before any possible allocation. If this refresh_mems() call is not done, then there is a risk that one of the cpuset_zone_allowed() calls made from within the page allocator (__alloc_pages) will find that the mems_generation of the current task doesn't match that of its cpuset, causing it to try to grab cpuset_sem. Since it already held cpuset_sem, this deadlocks that task, and any subsequent task wanting cpuset_sem. ==> The code paths leading to the kmalloc in check_for_release(), from cpuset_exit, cpuset_rmdir and attach_task (for the detached cpuset), fail to invoke refresh_mems() as required. The fix is easy enough - add the requisite refresh_mems() call. Unless someone is rapidly creating, modifying and destroying cpusets, they are unlikely to have any chance of encountering this deadlock. And even then, it is apparently difficult to do so. In the case we got here from cpuset_exit(), we have already torn down the tasks connection to this cpuset and current->cpuset is NULL. Don't call refresh_mems() in that case - it oops the kernel. Signed-off-by: Paul Jackson <[EMAIL PROTECTED]> Index: linux-2.6.13-mem_exclusive_oom/kernel/cpuset.c =================================================================== --- linux-2.6.13-mem_exclusive_oom.orig/kernel/cpuset.c +++ linux-2.6.13-mem_exclusive_oom/kernel/cpuset.c @@ -458,7 +458,10 @@ static void check_for_release(struct cpu if (notify_on_release(cs) && atomic_read(&cs->count) == 0 && list_empty(&cs->children)) { char *buf; + static void refresh_mems(void); + if (current->cpuset) + refresh_mems(); buf = kmalloc(PAGE_SIZE, GFP_KERNEL); if (!buf) return; -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROTECTED]> 1.650.933.1373 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/