On 05/16, Christian Brauner wrote:
>
> With the introduction of pidfds through CLONE_PIDFD it is possible to
> created pidfds at process creation time.

Now I am wondering why do we need CLONE_PIDFD, you can just do

        pid = fork();
        pidfd_open(pid);

> +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> +{
> +     int fd, ret;
> +     struct pid *p;
> +     struct task_struct *tsk;
> +
> +     if (flags)
> +             return -EINVAL;
> +
> +     if (pid <= 0)
> +             return -EINVAL;
> +
> +     p = find_get_pid(pid);
> +     if (!p)
> +             return -ESRCH;
> +
> +     ret = 0;
> +     rcu_read_lock();
> +     /*
> +      * If this returns non-NULL the pid was used as a thread-group
> +      * leader. Note, we race with exec here: If it changes the
> +      * thread-group leader we might return the old leader.
> +      */
> +     tsk = pid_task(p, PIDTYPE_TGID);
> +     if (!tsk)
> +             ret = -ESRCH;
> +     rcu_read_unlock();
> +
> +     fd = ret ?: pidfd_create(p);
> +     put_pid(p);
> +     return fd;
> +}

Looks correct, feel free to add Reviewed-by: Oleg Nesterov <o...@redhat.com>

But why do we need task_struct *tsk?

        rcu_read_lock();
        if (!pid_task(PIDTYPE_TGID))
                ret = -ESRCH;
        rcu_read_unlock();

and in fact we do not even need rcu_read_lock(), we could do

        // shut up rcu_dereference_check()
        rcu_lock_acquire(&rcu_lock_map);
        if (!pid_task(PIDTYPE_TGID))
                ret = -ESRCH;
        rcu_lock_release(&rcu_lock_map);

Well... I won't insist, but the comment about the race with exec looks a bit
confusing to me. It is true, but we do not care at all, we are not going to
use the task_struct returned by pid_task().

Oleg.

Reply via email to