This is something I've had on my mind for a while, so the LKML thread made me try to write it down.
The reason I started digging into the jobserver code is the following:
Currently, it's very hard to interact with the GNU Make jobserver, for
the simple reason that one doesn't know whether the pipe is blocking or
non-blocking, meaning "downstream" users have to implement two widely
different strategies. On Linux, that can be worked around by opening
/proc/self/fd/NN with or without O_NONBLOCK as desired, since that gives
a new file description (struct file * in kernel-speak).

But it's even worse for "upstream", i.e. a build system (say, Yocto)
that wants to set up a jobserver - without knowing whether Make expects
a blocking or non-blocking pipe, I don't see how one can actually set
oneself up as a top-level jobserver.

So, here's a proposal that I'm sure is flawed, but I won't learn how
unless I send it out:

(1) keep SIGCHLD, SIGINT, SIGTERM (and whatever other signals need
    handling) blocked everywhere except where noted below - no need for
    SA_RESTART.

(2) have a standard self-pipe for handling signals; all handled signals
    (including SIGCHLD) use the same handler, which simply does
    write(sigpipe[1], &sig, sizeof(sig)) - no signal-unsafe stuff at
    all.

(3) use (and expect to inherit) a non-blocking jobserver pipe.

(4) main loop (very roughly, of course)

  while ((jobs_running || eligible_jobs) && !stop) {
    struct pollfd pfd[2];

    if (!quitsigs && eligible_jobs && !jobs_running)
      start_a_job();

    add sigpipe[0] to pfd;
    if (eligible_jobs)
      add jobserver[0] to pfd;

    unblock_signals();
    ret = poll(pfd, nfd, perhaps a timeout to deal with the
               "only if load is below foo" option);
    block_signals();
    if (EINTR)
      continue; // or perhaps just an EINTR loop around poll

    if (sigpipe[0] is readable) {
      while (read sig != -EAGAIN) {
        switch (sig) {
        case SIGCHLD:
          reap_children(); // update jobs_running and eligible_jobs,
                           // write back tokens as appropriate,
                           // deal with a failed job, etc.
          break;
        case SIGINT:
        case SIGTERM:
          if (!quitsigs++) {
            print(waiting for jobs);
          } else {
            stop = 1;
          }
        }
      }
    }

    if (eligible_jobs && jobserver[0] is readable) {
      while (eligible_jobs && read a token)
        start_a_job();
    }
  }

This way, there's only one single place where we block, namely in the
poll() call, and I don't see how we can miss an event (a child dying or
a SIGTERM/SIGINT): If there are no eligible jobs, we will only return
from poll() once we get a signal (first return may be EINTR, then we
loop around and see the sigpipe is readable). If we do have eligible
jobs, we may return from poll() because there's a token available, and
then a signal may come in right after, before we block signals. In that
case, we'll just do the "try to get a token (knowing that it may have
been snatched by someone else), start a job", but the signal will be
handled in the next loop iteration, and it's indistinguishable from the
signal coming in right after blocking signals.

Obviously, start_a_job() must (reset signal handlers and) unblock
signals after fork() so the child inherits an expected environment, and
the above implicitly relies on WNOHANG being available so we don't have
to rely on fragile signal counting. But even legacy platforms without
WNOHANG should be able to do the above: When handling a SIGCHLD, instead
of a WNOHANG loop, just do blocking wait() (still with signals blocked)
as long as jobs_running > 0. That will likely keep the tokens
under-utilized, but it only affects a tiny minority of platforms [*].

In any case, I'd really appreciate if the jobserver protocol became more
strictly defined, especially so that things above GNU Make could set up
a jobserver. Perhaps (if GNU Make can actually always be made to work
with an O_NONBLOCK pipe) as some base rules:

(1) create an O_NONBLOCK pipe

(2) fill the pipe initially with '+' tokens

(3) always write back the token that was read, unless following a later
    revision of these rules that assigns different meanings to certain
    tokens.
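
For illustration, the three base rules above could be sketched in C roughly as follows. This is only a sketch of the proposed rules, not what GNU Make does today; the helper names (set_nonblock, jobserver_create, jobserver_try_acquire, jobserver_release) are hypothetical, and how many tokens to preload is left to the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch one descriptor to non-blocking mode. */
static int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Rules (1) and (2): create an O_NONBLOCK pipe and preload it with
 * ntokens '+' tokens. Returns 0 on success, -1 on error. */
int jobserver_create(int fds[2], int ntokens)
{
    if (pipe(fds) < 0)
        return -1;
    if (set_nonblock(fds[0]) < 0 || set_nonblock(fds[1]) < 0)
        goto fail;
    for (int i = 0; i < ntokens; i++) {
        char tok = '+';
        /* Fails with EAGAIN if ntokens exceeds the pipe capacity. */
        if (write(fds[1], &tok, 1) != 1)
            goto fail;
    }
    return 0;
fail:
    close(fds[0]);
    close(fds[1]);
    return -1;
}

/* Try to take one token without blocking.
 * Returns 1 if a token was read into *tok, 0 if none is available
 * right now, -1 on EOF or any other error. */
int jobserver_try_acquire(int rfd, char *tok)
{
    ssize_t n;
    do {
        n = read(rfd, tok, 1);
    } while (n < 0 && errno == EINTR);
    if (n == 1)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}

/* Rule (3): write back exactly the token that was read. */
int jobserver_release(int wfd, char tok)
{
    ssize_t n;
    do {
        n = write(wfd, &tok, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}
```

A client following these rules would call jobserver_try_acquire() only after poll() reports the read end readable, treat a 0 return as "someone else snatched the token", and pass the same byte to jobserver_release() when the job finishes. How the descriptors are advertised to child processes is outside this sketch.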
Even just "these are the rules followed by GNU Make on modern platforms
that have feature this and that" would be very helpful.

Thanks,
Rasmus

[*] and it won't react to SIGTERM during the wait(), hrrmm... I think I
have a way around that, but it's ugly and there's already way too many
places above where I'm probably wrong.