* Linus Torvalds <torva...@linux-foundation.org> wrote:

> On Jun 12, 2015 00:23, "Ingo Molnar" <mi...@kernel.org> wrote:
> >
> > We might make it so: but that would mean restricting certain clone_flags 
> > variants - not sure that's possible with our current ABI usage?
> 
> We already do that. You can't share signal info unless you share the mm. And
> a shared signal state is what defines a thread group.
> 
> So I think the only issue is that ->mm can become NULL when the thread group 
> leader dies - a non-NULL mm should always be shared among all threads.

Indeed, we do that in exit_mm().
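
For reference, the clone_flags restriction quoted above can be modelled in 
userspace roughly as below. This is only a sketch of the flag check, not the 
kernel code: the SK_ names and sk_check_clone_flags() are made up for 
illustration, and the constants are local copies of the CLONE_VM, CLONE_SIGHAND 
and CLONE_THREAD values.

```c
#include <errno.h>

/* Local mirrors of the CLONE_* flag values; the SK_ names are hypothetical. */
#define SK_CLONE_VM      0x00000100UL
#define SK_CLONE_SIGHAND 0x00000800UL
#define SK_CLONE_THREAD  0x00010000UL

/*
 * Hypothetical helper: reject flag combinations that would let tasks
 * share signal state without sharing the mm.
 * Returns 0 if the combination is allowed, -EINVAL otherwise.
 */
static int sk_check_clone_flags(unsigned long clone_flags)
{
	/* A thread group implies shared signal handlers ... */
	if ((clone_flags & SK_CLONE_THREAD) && !(clone_flags & SK_CLONE_SIGHAND))
		return -EINVAL;

	/* ... and shared signal handlers imply a shared mm. */
	if ((clone_flags & SK_CLONE_SIGHAND) && !(clone_flags & SK_CLONE_VM))
		return -EINVAL;

	return 0;
}
```

With these two checks, any task created with the thread flag necessarily shares 
the mm of its group, which is the property the argument relies on.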

So we could add tsk->mm_leader or so, which does not get cleared and which the 
scheduler does not look at, but I'm not sure it's entirely safe that way: we 
don't have a refcount, and when the last thread exits it becomes bogus for a 
small window until the zombie leader is unlinked from the task list.

To close that race we'd have __mmdrop() or so clear out tsk->mm_leader - but 
the task doing the mmdrop() might be a lazy thread totally unrelated to the 
original thread group so we don't know which tsk->mm_leader to clear out.

To solve that we'd have to track the leader owning an MM in mm_struct - which 
gets interesting for the exec() case where the thread group gets a new leader, 
so we'd have to re-link the mm's leader pointer there.

So unless I missed some simpler solution, there are a good number of steps 
where this could go wrong, in small-looking race windows - how about we just 
live with iterating through all tasks instead of just all processes, once per 
512 GB of memory mapped?
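
To illustrate why walking only processes is not enough today, here is a toy 
userspace model (not kernel code; struct toy_task, struct toy_mm and both 
helpers are invented for this sketch). A thread group leader that has already 
gone through exit_mm() has a NULL ->mm, so a leader-only walk misses an mm that 
the remaining threads still use, while a walk over all tasks finds it.

```c
#include <stddef.h>

/* Toy stand-ins for mm_struct and task_struct; all names are illustrative. */
struct toy_mm {
	int id;
};

struct toy_task {
	struct toy_mm *mm;        /* NULL once the task has dropped its mm */
	struct toy_task *leader;  /* thread group leader; self for the leader */
};

/* Walk only thread group leaders, i.e. "all processes". */
static int toy_count_mm_by_process(const struct toy_task *tasks, size_t n,
				   const struct toy_mm *mm)
{
	int hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].leader != &tasks[i])
			continue;	/* not a leader, skipped in this walk */
		if (tasks[i].mm == mm)
			hits++;
	}
	return hits;
}

/* Walk every task, leaders and non-leaders alike. */
static int toy_count_mm_by_task(const struct toy_task *tasks, size_t n,
				const struct toy_mm *mm)
{
	int hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].mm == mm)
			hits++;
	}
	return hits;
}
```

With a zombie leader (mm == NULL) and two live threads sharing one mm, the 
process walk reports no user of that mm while the task walk reports both 
threads. The cost of the second walk is what the proposal above accepts.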

Thanks,

        Ingo