On 01/22, Alex Thorlton wrote:
>
> At a glance, without testing, it looks like a good idea to me.  By
> using def_flags, we leverage functionality that's already in place to
> achieve the same result.  We don't need to add any new checks into the
> fault path or into khugepaged, since we're just leveraging the
> VM_HUGEPAGE/NOHUGEPAGE flag, which we already check for.  We also get
> the behavior that you suggested (madvise is still respected, even with
> the new THP disable prctl set), for free with this method.

Yes, exactly.

> I like the idea, but I think that it should probably be a separate
> change from the other few cleanups that you proposed along with it,

Yes, sure, that is why I sent them separately,

> since
> they're somewhat unrelated to this particular issue.  Do you agree?

Not really. Note that without 1/2 VM_NOHUGEPAGE won't survive after
exec. And without 2/2 madvise(MADV_HUGEPAGE) won't work after
PR_SET_THP_DISABLE.

But again, I think that these 2 simple cleanups make sense even without
PR_SET_THP_DISABLE.

> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index ac1842e..eb8b0fc 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -2029,6 +2029,19 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long,
> > arg2, unsigned long, arg3,
> >  		if (arg2 || arg3 || arg4 || arg5)
> >  			return -EINVAL;
> >  		return current->no_new_privs ? 1 : 0;
> > +	case PR_SET_THP_DISABLE:
> > +	case PR_GET_THP_DISABLE:
> > +		down_write(&me->mm->mmap_sem);
> > +		if (option == PR_SET_THP_DISABLE) {
> > +			if (arg2)
> > +				me->mm->def_flags |= VM_NOHUGEPAGE;
> > +			else
> > +				me->mm->def_flags &= ~VM_NOHUGEPAGE;
> > +		} else {
> > +			error = !!(me->mm->flags && VM_NOHUGEPAGE);
>
> Should be:
>
> error = !!(me->mm->def_flags && VM_NOHUGEPAGE);

No, we need to return 1 if this bit is set ;)

Oleg.
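
To make the last point concrete: both quoted getter lines use the logical
"&&", which is true whenever the flags word is non-zero at all, while
testing a single bit needs the bitwise "&". Below is a minimal user-space
sketch of the difference. The struct, the demo_* names and the flag values
are hypothetical stand-ins chosen for illustration, not the kernel's
definitions.

```c
#include <assert.h>

/* Illustrative values only; the kernel has its own VM_* definitions. */
#define DEMO_VM_OTHER       0x00000001UL
#define DEMO_VM_NOHUGEPAGE  0x40000000UL

/* Hypothetical stand-in for the def_flags word of an mm. */
struct demo_mm {
	unsigned long def_flags;
};

/* Mirrors the "set" branch of the quoted patch: set or clear one bit. */
void demo_set_thp_disable(struct demo_mm *mm, unsigned long arg2)
{
	if (arg2)
		mm->def_flags |= DEMO_VM_NOHUGEPAGE;
	else
		mm->def_flags &= ~DEMO_VM_NOHUGEPAGE;
}

/* Logical AND: 1 whenever def_flags is non-zero, whatever bits are set. */
int demo_get_logical(const struct demo_mm *mm)
{
	return !!(mm->def_flags && DEMO_VM_NOHUGEPAGE);
}

/* Bitwise AND: 1 only if the DEMO_VM_NOHUGEPAGE bit itself is set. */
int demo_get_bitwise(const struct demo_mm *mm)
{
	return !!(mm->def_flags & DEMO_VM_NOHUGEPAGE);
}

/* Small self-check showing where the two getters disagree. */
void demo_selfcheck(void)
{
	struct demo_mm mm = { .def_flags = DEMO_VM_OTHER };

	assert(demo_get_logical(&mm) == 1);	/* wrong: bit is not set */
	assert(demo_get_bitwise(&mm) == 0);	/* right */

	demo_set_thp_disable(&mm, 1);
	assert(demo_get_bitwise(&mm) == 1);
}
```

Calling demo_selfcheck() from a main() exercises the case where an
unrelated bit is set: the logical form reports 1 even though the
no-hugepage bit is clear, and only the bitwise form follows the bit.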