Oleg Nesterov <o...@redhat.com> writes: > Suppose we have 2 threads, the group-leader L and a sub-theread T, > both parked in ptrace_stop(). Debugger tries to resume both threads > and does > > ptrace(PTRACE_CONT, T); > ptrace(PTRACE_CONT, L); > > If the sub-thread T execs in between, the 2nd PTRACE_CONT doesn not > resume the old leader L, it resumes the post-exec thread T which was > actually now stopped in PTHREAD_EVENT_EXEC. In this case the > PTHREAD_EVENT_EXEC event is lost, and the tracer can't know that the > tracee changed its pid.
The change seems sensible. I don't expect this is common but it looks painful to deal with if it happens. Acked-by: "Eric W. Biederman" <ebied...@xmission.com> I am wondering if this should be expanded to all ptrace types for consistency. Or maybe we should set a flag to make this happen for all ptrace events. It just seems really odd to only worry about missing this event. I admit this a threaded PTRACE_EVENT_EXEC is the only event we are likely to miss but still. Do you by any chance have any debugger/strace test cases? I would think that would be the way to test to see if this breaks anything. I think I remember strace having a good test suite. > This patch makes ptrace() fail in this case until debugger does wait() > and consumes PTHREAD_EVENT_EXEC which reports old_pid. This affects all > ptrace requests except the "asynchronous" PTRACE_INTERRUPT/KILL. > > The patch doesn't add the new PTRACE_ option to not complicate the API, > and I _hope_ this won't cause any noticeable regression: > > - If debugger uses PTRACE_O_TRACEEXEC and the thread did an exec > and the tracer does a ptrace request without having consumed > the exec event, it's 100% sure that the thread the ptracer > thinks it is targeting does not exist anymore, or isn't the > same as the one it thinks it is targeting. > > - To some degree this patch adds nothing new. In the scenario > above ptrace(L) can fail with -ESRCH if it is called after the > execing sub-thread wakes the leader up and before it "steals" > the leader's pid. > > Test-case: > > #include <stdio.h> > #include <unistd.h> > #include <signal.h> > #include <sys/ptrace.h> > #include <sys/wait.h> > #include <errno.h> > #include <pthread.h> > #include <assert.h> > > void *tf(void *arg) > { > execve("/usr/bin/true", NULL, NULL); > assert(0); > > return NULL; > } > > int main(void) > { > int leader = fork(); > if (!leader) { > kill(getpid(), SIGSTOP); > > pthread_t th; > pthread_create(&th, NULL, tf, NULL); > for (;;) > pause(); > > return 0; > } > > waitpid(leader, NULL, WSTOPPED); > > ptrace(PTRACE_SEIZE, leader, 0, > PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC); > waitpid(leader, NULL, 0); > > ptrace(PTRACE_CONT, leader, 0,0); > waitpid(leader, NULL, 0); > > int status, thread = waitpid(-1, &status, 0); > assert(thread > 0 && thread != leader); > assert(status == 0x80137f); > > ptrace(PTRACE_CONT, thread, 0,0); > /* > * waitid() because waitpid(leader, &status, WNOWAIT) does not > * report status. Why ???? > * > * Why WEXITED? because we have another kernel problem connected > * to mt-exec. > */ > siginfo_t info; > assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) > == 0); > assert(info.si_pid == leader && info.si_status == 0x0405); > > /* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */ > assert(ptrace(PTRACE_CONT, leader, 0,0) == -1); > assert(errno == ESRCH); > > assert(leader == waitpid(leader, &status, WNOHANG)); > assert(status == 0x04057f); > > assert(ptrace(PTRACE_CONT, leader, 0,0) == 0); > > return 0; > } > > Signed-off-by: Oleg Nesterov <o...@redhat.com> > --- > kernel/ptrace.c | 24 +++++++++++++++++++++++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index 43d6179508d6..1037251ae4a5 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -169,6 +169,27 @@ void __ptrace_unlink(struct task_struct *child) > spin_unlock(&child->sighand->siglock); > } > > +static bool looks_like_a_spurious_pid(struct task_struct *task) > +{ > + int pid; > + > + if (task->exit_code != ((PTRACE_EVENT_EXEC << 8) | SIGTRAP)) > + return false; > + > + rcu_read_lock(); > + pid = task_pid_nr_ns(task, task_active_pid_ns(task->parent)); > + rcu_read_unlock(); > + > + if (pid == task->ptrace_message) > + return false; > + /* > + * The tracee changed its pid but the PTRACE_EVENT_EXEC event > + * was not wait()'ed, most probably debugger targets the old > + * leader which was destroyed in de_thread(). > + */ > + return true; > +} > + > /* Ensure that nothing can wake it up, even SIGKILL */ > static bool ptrace_freeze_traced(struct task_struct *task) > { > @@ -179,7 +200,8 @@ static bool ptrace_freeze_traced(struct task_struct *task) > return ret; > > spin_lock_irq(&task->sighand->siglock); > - if (task_is_traced(task) && !__fatal_signal_pending(task)) { > + if (task_is_traced(task) && !looks_like_a_spurious_pid(task) && > + !__fatal_signal_pending(task)) { > task->state = __TASK_TRACED; > ret = true; > } Eric