Oleg Nesterov <o...@redhat.com> writes:

> Suppose we have 2 threads, the group-leader L and a sub-theread T,
> both parked in ptrace_stop(). Debugger tries to resume both threads
> and does
>
>       ptrace(PTRACE_CONT, T);
>       ptrace(PTRACE_CONT, L);
>
> If the sub-thread T execs in between, the 2nd PTRACE_CONT doesn not
> resume the old leader L, it resumes the post-exec thread T which was
> actually now stopped in PTHREAD_EVENT_EXEC. In this case the
> PTHREAD_EVENT_EXEC event is lost, and the tracer can't know that the
> tracee changed its pid.

The change seems sensible.  I don't expect this is common but it looks
painful to deal with if it happens.

Acked-by: "Eric W. Biederman" <ebied...@xmission.com>

I am wondering if this should be expanded to all ptrace types for
consistency.  Or maybe we should set a flag to make this happen for
all ptrace events.

It just seems really odd to only worry about missing this event.
I admit this a threaded PTRACE_EVENT_EXEC is the only event we are
likely to miss but still.

Do you by any chance have any debugger/strace test cases?

I would think that would be the way to test to see if this breaks
anything.  I think I remember strace having a good test suite.

> This patch makes ptrace() fail in this case until debugger does wait()
> and consumes PTHREAD_EVENT_EXEC which reports old_pid. This affects all
> ptrace requests except the "asynchronous" PTRACE_INTERRUPT/KILL.
>
> The patch doesn't add the new PTRACE_ option to not complicate the API,
> and I _hope_ this won't cause any noticeable regression:
>
>       - If debugger uses PTRACE_O_TRACEEXEC and the thread did an exec
>         and the tracer does a ptrace request without having consumed
>         the exec event, it's 100% sure that the thread the ptracer
>         thinks it is targeting does not exist anymore, or isn't the
>         same as the one it thinks it is targeting.
>
>       - To some degree this patch adds nothing new. In the scenario
>         above ptrace(L) can fail with -ESRCH if it is called after the
>         execing sub-thread wakes the leader up and before it "steals"
>         the leader's pid.
>
> Test-case:
>
>       #include <stdio.h>
>       #include <unistd.h>
>       #include <signal.h>
>       #include <sys/ptrace.h>
>       #include <sys/wait.h>
>       #include <errno.h>
>       #include <pthread.h>
>       #include <assert.h>
>
>       void *tf(void *arg)
>       {
>               execve("/usr/bin/true", NULL, NULL);
>               assert(0);
>
>               return NULL;
>       }
>
>       int main(void)
>       {
>               int leader = fork();
>               if (!leader) {
>                       kill(getpid(), SIGSTOP);
>
>                       pthread_t th;
>                       pthread_create(&th, NULL, tf, NULL);
>                       for (;;)
>                               pause();
>
>                       return 0;
>               }
>
>               waitpid(leader, NULL, WSTOPPED);
>
>               ptrace(PTRACE_SEIZE, leader, 0,
>                               PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC);
>               waitpid(leader, NULL, 0);
>
>               ptrace(PTRACE_CONT, leader, 0,0);
>               waitpid(leader, NULL, 0);
>
>               int status, thread = waitpid(-1, &status, 0);
>               assert(thread > 0 && thread != leader);
>               assert(status == 0x80137f);
>
>               ptrace(PTRACE_CONT, thread, 0,0);
>               /*
>                * waitid() because waitpid(leader, &status, WNOWAIT) does not
>                * report status. Why ????
>                *
>                * Why WEXITED? because we have another kernel problem connected
>                * to mt-exec.
>                */
>               siginfo_t info;
>               assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) 
> == 0);
>               assert(info.si_pid == leader && info.si_status == 0x0405);
>
>               /* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */
>               assert(ptrace(PTRACE_CONT, leader, 0,0) == -1);
>               assert(errno == ESRCH);
>
>               assert(leader == waitpid(leader, &status, WNOHANG));
>               assert(status == 0x04057f);
>
>               assert(ptrace(PTRACE_CONT, leader, 0,0) == 0);
>
>               return 0;
>       }
>
> Signed-off-by: Oleg Nesterov <o...@redhat.com>
> ---
>  kernel/ptrace.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 43d6179508d6..1037251ae4a5 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -169,6 +169,27 @@ void __ptrace_unlink(struct task_struct *child)
>       spin_unlock(&child->sighand->siglock);
>  }
>  
> +static bool looks_like_a_spurious_pid(struct task_struct *task)
> +{
> +     int pid;
> +
> +     if (task->exit_code != ((PTRACE_EVENT_EXEC << 8) | SIGTRAP))
> +             return false;
> +
> +     rcu_read_lock();
> +     pid = task_pid_nr_ns(task, task_active_pid_ns(task->parent));
> +     rcu_read_unlock();
> +
> +     if (pid == task->ptrace_message)
> +             return false;
> +     /*
> +      * The tracee changed its pid but the PTRACE_EVENT_EXEC event
> +      * was not wait()'ed, most probably debugger targets the old
> +      * leader which was destroyed in de_thread().
> +      */
> +     return true;
> +}
> +
>  /* Ensure that nothing can wake it up, even SIGKILL */
>  static bool ptrace_freeze_traced(struct task_struct *task)
>  {
> @@ -179,7 +200,8 @@ static bool ptrace_freeze_traced(struct task_struct *task)
>               return ret;
>  
>       spin_lock_irq(&task->sighand->siglock);
> -     if (task_is_traced(task) && !__fatal_signal_pending(task)) {
> +     if (task_is_traced(task) && !looks_like_a_spurious_pid(task) &&
> +         !__fatal_signal_pending(task)) {
>               task->state = __TASK_TRACED;
>               ret = true;
>       }

Eric

Reply via email to