"Adam Litke" <[EMAIL PROTECTED]> writes: > On 6/12/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote: >> Adam Litke <[EMAIL PROTECTED]> writes: >> >> > Here's another breakage as a result of shared memory stacked files :( >> > >> > The NUMA policy for a VMA is determined by checking the following (in the > order >> > given):
Where that doesn't appear to be what the current code does. If there is a VMA it doesn't appear that we check task->mempolicy. >> > 1) vma->vm_ops->get_policy() (if defined) >> > 2) vma->vm_policy (if defined) >> > 3) task->mempolicy (if defined) >> > 4) Fall back to default_policy >> > >> > By switching to stacked files for shared memory, get_policy() is now always > set >> > to shm_get_policy which is a wrapper function. This causes us to stop at > step >> > 1, which yields NULL for hugetlb instead of task->mempolicy which was the >> > previous (and correct) result. >> > >> > This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for > the >> > wrapped vm_ops. Andi and Christoph, does this look right to you? >> >> I'm confused. >> >> I agree that the behavior you describe is correct. >> However I only see two code paths were get_policy is called and >> both of them take a NULL result and change it to task->mempolicy: > > The coffee hasn't quite absorbed yet, but don't those two code paths > take a NULL result from get_policy() and turn it into default_policy, > not task->mempolicy? True, a NULL is turned into default_policy. However the basic confusion remains. I don't see how we ever return task->mempolicy on those paths if we have a vma, and those appear to be the only call sites of get_policy. >> From mm/mempolicy.c >> >> > long do_get_mempolicy(int *policy, nodemask_t *nmask, >> > unsigned long addr, unsigned long flags) >> > { >> > int err; >> > struct mm_struct *mm = current->mm; >> > struct vm_area_struct *vma = NULL; >> > struct mempolicy *pol = current->mempolicy; >> > >> > cpuset_update_task_memory_state(); >> > if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR)) >> > return -EINVAL; >> > if (flags & MPOL_F_ADDR) { >> > down_read(&mm->mmap_sem); >> > vma = find_vma_intersection(mm, addr, addr+1); >> > if (!vma) { >> > up_read(&mm->mmap_sem); >> > return -EFAULT; >> > } >> > if (vma->vm_ops && vma->vm_ops->get_policy) >> > pol = vma->vm_ops->get_policy(vma, addr); >> > else >> > pol = vma->vm_policy; >> > } else if (addr) >> > return -EINVAL; >> > >> > if (!pol) >> > pol = &default_policy; >> >> >> >> > /* Return effective policy for a VMA */ >> > static struct mempolicy * get_vma_policy(struct task_struct *task, >> > struct vm_area_struct *vma, unsigned long addr) >> > { >> > struct mempolicy *pol = task->mempolicy; >> > >> > if (vma) { >> > if (vma->vm_ops && vma->vm_ops->get_policy) >> > pol = vma->vm_ops->get_policy(vma, addr); >> > else if (vma->vm_policy && >> > vma->vm_policy->policy != MPOL_DEFAULT) >> > pol = vma->vm_policy; >> > } >> > if (!pol) >> > pol = &default_policy; >> > return pol; >> > } >> >> >> Does this perhaps need to be: >> > Signed-off-by: Adam Litke <[EMAIL PROTECTED]> >> > >> > diff --git a/ipc/shm.c b/ipc/shm.c >> > index 4fefbad..8d2672d 100644 >> > --- a/ipc/shm.c >> > +++ b/ipc/shm.c >> > @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct >> > *vma, unsigned long addr) >> >> + pol = NULL; >> > >> > if (sfd->vm_ops->get_policy) >> > pol = sfd->vm_ops->get_policy(vma, addr); >> > - else >> > + else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT) >> > pol = vma->vm_policy; >> > return pol; >> > } >> > #endif > > afaict this would provide no way for pol to be set to task->mempolicy > for hugetlb per my comment above. Being dense I don't see a way for us to have ever returned task->mempolicy. So perhaps something else changed. Or something was made more consistent and another bug was revealed. I don't truly see the bug in shm_get_policy. I can see why it would not have the expected affect but that is different. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/