Hi Pavel,

Let me add lkml; we should not discuss this off-list.

On 03/20, Pavel Labath wrote:
>
> 1) we get a waitpid() notification that the tracee got SIGUSR1
> 2) we do a ptrace(GETSIGINFO) to get more info
> 3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
> SIGUSR1
> 4) immediately after that we get another waitpid notification, again with
> SIGUSR1, even though the thread had received no additional signals
> 5) we again try to do a GETSIGINFO; however, this time it fails with ESRCH.
> Therefore, we assume that the thread has died

I found a similar bug by code inspection some time ago. I even have
a fix, but I need to think more... And I even wrote the test-case ;)
see below.

But so far I can't say if you hit the same problem or not. If you can
reproduce the problem, perhaps I can send you a debugging patch?

Oleg.

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <assert.h>

#define tkill(pid, sig) \
        syscall(__NR_tkill, pid, sig)

void run_test(void)
{
        int pid, stat;

        pid = fork();
        if (!pid) {
                assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
                raise(SIGSTOP);
                assert(0);
        }

        /* 0x137f: the child is stopped (0x7f) by SIGSTOP (0x13) */
        assert(pid == wait(&stat) && stat == 0x137f);

        tkill(pid, SIGTRAP);    /* should not be reported */
        tkill(pid, SIGKILL);
        assert(pid == wait(&stat));
        if (stat == 0x9)
                return;

        printf("unexpected wait: stat=%x\n", stat);
        kill(0, SIGKILL);
}

int main(void)
{
        int i = 8; /* random */

        while (--i)
                if (!fork())
                        break;

        for (;;)
                run_test();

        return 0;
}
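
In case the raw status values in the test-case are not obvious: they
can be decoded with the standard <sys/wait.h> macros. A minimal sketch;
describe_wait_status() is a hypothetical helper, not part of the
test-case above, and the signal numbers assume Linux on x86, where
SIGSTOP is 19 (0x13) and SIGKILL is 9.

```c
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not part of the original test-case): turn a raw
 * wait() status into a short human-readable description in buf.
 */
static int describe_wait_status(int status, char *buf, size_t len)
{
        if (WIFSTOPPED(status))         /* low byte is 0x7f */
                return snprintf(buf, len, "stopped by signal %d",
                                WSTOPSIG(status));
        if (WIFSIGNALED(status))        /* low 7 bits hold the signal */
                return snprintf(buf, len, "killed by signal %d",
                                WTERMSIG(status));
        if (WIFEXITED(status))
                return snprintf(buf, len, "exited with code %d",
                                WEXITSTATUS(status));
        return snprintf(buf, len, "unknown status 0x%x", status);
}
```

With this, 0x137f decodes as "stopped by signal 19" (the initial
SIGSTOP) and 0x9 as "killed by signal 9" (the SIGKILL). Any other
status after the two tkill() calls is what the test reports as
unexpected.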

