[EMAIL PROTECTED] (Olof Johansson) wrote: > > Hi, > > Consider a small testcase that spawns off two threads, either thread > doing a loop of: > > buf = mmap /dev/zero MAP_SHARED for 0x100000 bytes > call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in > said mmap > munmap(buf) > repeat > > This will quickly lock up, since the futex_wait code dows a > down_read(mmap_sem), then a get_user(). > > The do_page_fault code on ppc64 (as well as other architectures) needs > to take the same semaphore for reading. This is all good until the > second thread comes into play: Its mmap call tries to take the same > semaphore for writing which causes in the do_page_fault down_read() > to get stuck. Classic deadlock.
Yup. Jamie says that the futex code _has_ to hold mmap_sem across the get_user(). I forget (but could probably locate) the details. > > One attempt to fix this is included below. It works, but I'm not entirely > happy with the fact that it's a bit messy solution. If anyone has a > better idea for how to solve it I'd be all ears. It's fairly sane. Style-wise I'd be inclined to turn this: down_read(¤t->mm->mmap_sem); while (!check_user_page_readable(current->mm, uaddr1)) { up_read(¤t->mm->mmap_sem); /* Fault in the page through get_user() but discard result */ if (get_user(curval, (int __user *)uaddr1) != 0) return -EFAULT; down_read(¤t->mm->mmap_sem); } into a standalone helper function. > --- linux-2.5.orig/mm/mempolicy.c 2005-02-04 00:27:40.000000000 -0600 > +++ linux-2.5/mm/mempolicy.c 2005-02-21 16:43:08.000000000 -0600 > @@ -486,6 +486,7 @@ > struct mm_struct *mm = current->mm; > struct vm_area_struct *vma = NULL; > struct mempolicy *pol = current->mempolicy; > + DECLARE_BITMAP(nodes, MAX_NUMNODES); > > if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR)) > return -EINVAL; > @@ -524,16 +525,21 @@ > } else > pval = pol->policy; > > + if (nmask) > + get_zonemask(pol, nodes); > + > + if (vma) { > + up_read(¤t->mm->mmap_sem); > + vma = NULL; > + } > + OK. > err = -EFAULT; > if (policy && put_user(pval, policy)) > goto out; > > err = 0; > - if (nmask) { > - DECLARE_BITMAP(nodes, MAX_NUMNODES); > - get_zonemask(pol, nodes); > + if (nmask) > err = copy_nodes_to_user(nmask, maxnode, nodes, sizeof(nodes)); > - } > > out: > if (vma) I don't think we need to hold mmap_sem while running get_zonemask(). `pol' is a copy of current->mempolicy, and it won't be going away. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/