* Linus Torvalds <torva...@linux-foundation.org> wrote:

> On Jun 12, 2015 00:23, "Ingo Molnar" <mi...@kernel.org> wrote:
> >
> > We might make it so: but that would mean restricting certain clone_flags
> > variants - not sure that's possible with our current ABI usage?
>
> We already do that. You can't share signal info unless you share the mm. And a
> shared signal state is what defines a thread group.
>
> So I think the only issue is that ->mm can become NULL when the thread group
> leader dies - a non-NULL mm should always be shared among all threads.

Indeed, we do that in exit_mm().

So we could add tsk->mm_leader or so, which does not get cleared and which the
scheduler does not look at, but I'm not sure it's entirely safe that way: we
don't have a refcount, and when the last thread exits it becomes bogus for a
small window until the zombie leader is unlinked from the task list.

To close that race we'd have __mmdrop() or so clear out tsk->mm_leader - but the
task doing the mmdrop() might be a lazy thread totally unrelated to the original
thread group, so we don't know which tsk->mm_leader to clear out.

To solve that we'd have to track the leader owning an MM in mm_struct - which
gets interesting for the exec() case where the thread group gets a new leader,
so we'd have to re-link the mm's leader pointer there.

So unless I missed some simpler solution, there are a good number of steps where
this could go wrong, in small-looking race windows - how about we just live with
iterating through all tasks instead of just all processes, once per 512 GB of
memory mapped?

Thanks,

	Ingo
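
For illustration only, here is a minimal userspace sketch of the leader-exit problem discussed above. It is not kernel code: toy_mm, toy_task, toy_exit_mm() and the two lookup helpers are hypothetical stand-ins, and the only behaviour modelled is that a task's mm pointer is cleared when that task exits. Locking, refcounts and lazy mm users are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for mm_struct and task_struct. */
struct toy_mm {
    int id;
};

struct toy_task {
    struct toy_mm *mm;      /* cleared when the task exits */
    bool is_group_leader;   /* true for the thread group leader */
};

/* Model of a task exiting: its mm pointer is cleared. */
void toy_exit_mm(struct toy_task *t)
{
    assert(t != NULL);
    t->mm = NULL;
}

/* Walk only thread group leaders ("all processes"). */
bool mm_found_via_leaders(const struct toy_task *tasks, size_t n,
                          const struct toy_mm *mm)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].is_group_leader && tasks[i].mm == mm)
            return true;
    return false;
}

/* Walk every task, leaders and non-leaders alike ("all tasks"). */
bool mm_found_via_all_tasks(const struct toy_task *tasks, size_t n,
                            const struct toy_mm *mm)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].mm == mm)
            return true;
    return false;
}
```

With one leader and one other thread sharing an mm, calling toy_exit_mm() on the leader makes the leader-only walk miss the mm, while the all-tasks walk still finds it through the surviving thread. That is the trade-off in question: the longer walk avoids the extra leader tracking and its race windows.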