On 2/9/2011 3:57 PM, Linus Torvalds wrote:
> [...]
> The problem is that
> 'set_job_status_and_cleanup()' does that
>
>     if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) && ..
>
> which just looks totally buggy and racy. There's even a comment about
> it in the bash source code, for chrissake!
>
> Here's the scenario:
>
>  - wait_for() sets wait_sigint_received to zero (look for the comment
>    here!), and installs the sigint handler
>  - it does other things too, but it does waitchld() that does the
>    actual waitpid() system call
>  - now, imagine the following scenario: the ^C happens just as the
>    child already exited successfully!
>  - so bash itself gets the sigint, and sets wait_sigint_received to 1
>
> So what happens? child->status will be successful (the child was not
> interrupted by the signal, it exited at just the right time), but bash
> saw the SIGINT. But because it thinks it needs to see *both* the
> sigint _and_ the WTERMSIG(child->status)==SIGINT, bash essentially
> ignores the ^C.

Do we really need to check wait_sigint_received here? If the child
exited because it received SIGINT, then every process in the terminal's
foreground process group received it as well, at least according to
"Proper handling of SIGINT/SIGQUIT" (*). Maybe the patch should be:

--- bash-4.1/jobs.c~ctrlc_exit_race	2011-02-07 13:52:48.000000000 +0100
+++ bash-4.1/jobs.c	2011-02-07 13:55:30.000000000 +0100
@@ -3299,7 +3299,7 @@ set_job_status_and_cleanup (job)
 		 signals are sent to process groups) or via kill(2) to the foreground
 		 process by another process (or itself).  If the shell did receive the
 		 SIGINT, it needs to perform normal SIGINT processing. */
-	      else if (wait_sigint_received && (WTERMSIG (child->status) == SIGINT) &&
+	      else if ((WTERMSIG (child->status) == SIGINT) &&
 		  IS_FOREGROUND (job) && IS_JOBCONTROL (job) == 0)
 		{
 		  int old_frozen;

And then wait_sigint_received may not be needed at all.

> [...]
>
> Now, it does look like the problem is at least partly because bash has
> a horrible time trying to figure out a truly ambiguous case: did the
> child process explicitly ignore the ^C or not? It looks like bash is
> trying to basically ignore the ^C in the case the child ignored it. I
> think that's misguided, but that does seem to be what bash is trying
> to do. It's misguided exactly because there is absolutely no way to
> know whether the child returned successfully because it just happened
> to exit just before the ^C came in, or whether it blocked ^C and
> ignored it. So even _trying_ to make that judgement call seems to be a
> bad idea.

I'm not an expert on Unix signals, but it seems that it is possible to
tell, via WTERMSIG (child->status). Again according to "Proper handling
of SIGINT/SIGQUIT" (*), a child that wants to do any cleanup upon
receiving SIGINT or SIGQUIT is supposed to "rekill" itself afterwards,
so that the parent knows the termination was initiated by a signal.

> And no, I don't know bash sources all that well.
> I played around with them a long time ago, and for this I only
> glanced at it quickly to get more of a view into what bash is trying
> to do (all thanks should go to Oleg who already pinpointed the line
> that breaks). Maybe there are subtle issues, maybe there are broken
> historical shell semantics here.

The "Proper handling of SIGINT/SIGQUIT" article (which Chet Ramey
<chet.ra...@case.edu> pointed to in a reply to the very first mail in
this thread) contains a very nice explanation of why the shell is
supposed to do this guessing. The article describes three possible
behaviors, and bash follows the best one, called "wait and cooperative
exit". The shell should not exit if the child ignored the signal,
because some programs use SIGINT for their own purposes rather than for
termination, and in that case the parent shell should just carry on.
Anyway, the article explains it all really nicely and in full detail :)

> [...]

(*) "Proper handling of SIGINT/SIGQUIT"
    http://www.cons.org/cracauer/sigint.html
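
To make the "rekill" convention concrete, here is a minimal sketch in C.
It is not bash's code and not taken from the article: the names
run_child, died_of_sigint, handler_rekill and handler_swallow are made
up for illustration, and the child sends SIGINT to itself as a stand-in
for a ^C from the terminal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child-side handler following the "rekill" convention:
   after any cleanup, restore the default action and send SIGINT to
   ourselves again, so the parent sees a death by signal rather than a
   normal exit. */
static void handler_rekill(int sig)
{
    /* (async-signal-safe cleanup would go here) */
    signal(sig, SIG_DFL);
    kill(getpid(), sig);   /* stays pending until the handler returns */
}

/* Hypothetical handler for a program that uses SIGINT for its own
   purposes and keeps running. */
static void handler_swallow(int sig)
{
    (void)sig;
}

/* Fork a child that receives one SIGINT and return its wait status,
   or -1 if fork() or waitpid() fails. */
int run_child(int rekill)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;
        sigset_t set;

        sa.sa_handler = rekill ? handler_rekill : handler_swallow;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGINT, &sa, NULL);

        /* Make sure SIGINT is not blocked by an inherited signal mask. */
        sigemptyset(&set);
        sigaddset(&set, SIGINT);
        sigprocmask(SIG_UNBLOCK, &set, NULL);

        kill(getpid(), SIGINT);  /* stand-in for ^C */
        _exit(0);                /* reached only if SIGINT was swallowed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}

/* Parent-side check: did the child terminate because of SIGINT? */
int died_of_sigint(int status)
{
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGINT;
}
```

With this sketch, died_of_sigint(run_child(1)) is nonzero and
died_of_sigint(run_child(0)) is zero. In the second case the status
alone does not say whether a SIGINT was ever delivered, which is the
ambiguous case discussed in the quoted text.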