How can a parent process intercepting a ptraced child's system calls using PTRACE_SYSCALL know which SIGTRAP's are system call entries and which are system call exits?
Since there does not seem to be a way for the parent to get this information from the system, ptrace examples floating around on the web generally use a boolean in_syscall flag that they toggle at each SIGTRAP. This mechanism seems perfectly adequate, but for it to work correctly the flag must obviously be appropriately initialized, and that is where things get vague. Let's assume that the child's ptrace-ing begins after it first does PTRACE_TRACEME and then calls execve. According to the ptrace man page, after doing PTRACE_TRACEME "all subsequent calls to exec() by this process will cause a SIGTRAP to be sent to it, giving the parent a chance to gain control before the new program begins execution". The problem is that this does not specify whether the parent will see SIGTRAP's for both entry to and exit from execve, or only for exit from it. To my great surprise, I observed both behaviors on actual machines (in particular, I observed the former on an x86_64 dual core machine). This inconsistency foils the in_syscall flag approach, as there is no way to know whether the parent should initialize that flag to true or false (since apparently both can occur). Hence, my question is: is this behavior intentional? Should there not be a guarantee that /either/ only the execve exit, /or/ both entry and exit, are SIGTRAPed? I have attached a testcase program. On some machines (2.6.18.8-0.5-default #1 SMP x86_64, and a 2.6.18.8-0.3-default #1 SMP i686) it shows only one initial execve occurence, while on another (2.6.18-8.1.8.el5 #1 SMP x86_64, dual core) it shows two. Regards, Eelis
#include <assert.h> #include <sys/ptrace.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <syscall.h> #include <sys/reg.h> #include <stdio.h> #ifdef __x86_64__ #define SYSCALL_OFF (ORIG_RAX * 8) #else #define SYSCALL_OFF (ORIG_EAX * 4) #endif int main() { pid_t const child = fork(); if (child == -1) perror("fork failed"); if(child == 0) { ptrace(PTRACE_TRACEME, 0, NULL, NULL); execl("/usr/bin/whoami", "whoami", NULL); perror("execl failed"); } for(;;) { int status; wait(&status); if(WIFEXITED(status)) break; assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); long const syscall = ptrace(PTRACE_PEEKUSER, child, SYSCALL_OFF, NULL); printf("%ld ", syscall); ptrace(PTRACE_SYSCALL, child, NULL, NULL); } printf("\n"); return 0; }