Hi Pavel, let me add lkml, we should not discuss this offlist.
On 03/20, Pavel Labath wrote:
>
> 1) we get a waitpid() notification that the tracee got SIGUSR1
> 2) we do a ptrace(GETSIGINFO) to get more info
> 3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
>    SIGUSR1
> 4) immediately after that we get another waitpid notification, again with
>    SIGUSR1, even though the thread had received no additional signals
> 5) we again try to a GETSIGINFO, however this time it fails with ESRCH.
>    Therefore, we assume that the thread has died

I found a similar bug by code inspection some time ago. I even have a fix,
but I need to think more... And I even wrote the test-case ;) see below.

But so far I can't say if you hit the same problem or not. If you can
reproduce the problem, perhaps I can send you a debugging patch?

Oleg.

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <assert.h>

#define tkill(pid, sig) \
	syscall(__NR_tkill, pid, sig)

void run_test(void)
{
	int pid, stat;

	pid = fork();
	if (!pid) {
		/* child: become a tracee and stop, the parent kills us */
		assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
		raise(SIGSTOP);
		assert(0);
	}

	/* 0x137f: the tracee is stopped by SIGSTOP */
	assert(pid == wait(&stat) && stat == 0x137f);

	tkill(pid, SIGTRAP); /* should not be reported */
	tkill(pid, SIGKILL);

	assert(pid == wait(&stat));

	/* 0x9: the tracee was killed by SIGKILL, as expected */
	if (stat == 0x9)
		return;

	printf("unexpected wait: stat=%x\n", stat);
	kill(0, SIGKILL);
}

int main(void)
{
	int i = 8; /* random */

	/* run the test in several processes in parallel */
	while (--i)
		if (!fork())
			break;

	for (;;)
		run_test();

	return 0;
}
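
For reference, the raw status values the test-case compares against (0x137f
and 0x9) can be decoded with the standard wait macros. This is only a
minimal sketch, assuming the usual Linux status encoding; the helper names
stop_signal() and term_signal() are made up for illustration and are not
part of the test-case above.

```c
#include <signal.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: return the signal that stopped the child,
 * or -1 if the status does not describe a stopped child.
 */
int stop_signal(int status)
{
	return WIFSTOPPED(status) ? WSTOPSIG(status) : -1;
}

/*
 * Hypothetical helper: return the signal that killed the child,
 * or -1 if the status does not describe a signal death.
 */
int term_signal(int status)
{
	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

With these, stop_signal(0x137f) should be SIGSTOP (19 on x86 and most other
architectures) and term_signal(0x9) should be SIGKILL, which matches the two
states the test expects: stopped after raise(SIGSTOP), then killed.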