On 02/25, Oleg Nesterov wrote:
>
> On 02/24, Pavel Tikhomirov wrote:
> >
> > This moves the condition (tid != 1 && !tmp->child_reaper) to after idr
> > alloc, so it not only covers that first process in pid namespace has pid
> > 1 in case of clone3(set_tid) requesting wrong pid, but also if idr
> > itself gives wrong pid for some reason.
> >
> > This could've been the case before this patch, when creating first
> > process the alloc_pid()->pidfs_add_pid() code path fails, so that the
> > idr->idr_next is non zero anymore and next process calling to
> > alloc_pid(), will get 2 as a pid from idr_alloc_cyclic(). Effectively
> > leading to init-less pid namespace, which is a bug.
>
> Yes.
>
> alloc_pid() does:
>
>       /* On failure to allocate the first pid, reset the state */
>       if (ns->pid_allocated == PIDNS_ADDING)
>               idr_set_cursor(&ns->idr, 0);
>
> but this logic is broken.
>
> Suppose that a task P does sys_unshare(CLONE_NEWPID). Then it does
> fork(), and fork() fails for any reason after alloc_pid() succeeds.
> If P does another fork() to retry, we have a bug.
>
> So with this patch we can either remove the code above, or (better)
> improve this logic.

Damn, no, I was wrong ;) As Andrei pointed out, free_pid() will clear
PIDNS_ADDING in this case.

Still I think the patch is fine.

Oleg.


Reply via email to