It turns out that sh(1) has a bug when the wait builtin command is applied to a job that stops. (Maybe I created it when I reconstructed the wait builtin command, or perhaps it was there before and I just rearranged it; it isn't worth the effort to trace the history.)
It must be a bug, as the results are inconsistent, depending upon whether the job stopped before or after the wait command is issued. If it stops while the wait builtin is running, wait returns with a status indicating that the job "exited" due to the signal that caused the job (or perhaps just the process) to stop. If the job was already stopped, the wait builtin says it doesn't exist. Neither of those is really correct: POSIX requires the wait builtin to wait until the process has terminated, and then complete with the appropriate status (with caveats relating to signals interrupting the wait while it is waiting, which are not relevant right now).

This came to light when Harald van Dijk sent the following small test case to the Austin Group (POSIX maintainers) mailing list:

	sleep 10 &
	fg
	<Ctrl-Z>
	wait $!
	kill -l $?

and reported on the results from a bunch of shells (not including ours or the FreeBSD sh). There was little consistency. A simple reading of POSIX would require the wait command there to wait forever (or until some external agent sent SIGCONT to the sleep process). None of the shells he tested did that. (The FreeBSD shell does, though if a SIGCONT is sent it ends up reporting the status from the completed job twice, once via wait and then again as a background job completing, which is wrong: whichever of those happens first should remove the job from the jobs table, making it unavailable to the other ... here that should be the wait.)

The "best" of the shells (for this anyway) recognise that the wait will (usually) hang forever, and effectively turn it into a "fg" command: resuming the stopped job that is to be awaited, waiting, and then exiting (only ksh93 got the correct status from the sleep, however).
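To make the last line of that test case concrete, here is a minimal sketch of how a "killed by signal" wait status is decoded with kill -l. It assumes a shell that reports such a status as a value greater than 128 (most sh implementations use 128 plus the signal number; ksh93 encodes it differently), and the variable names are just for illustration. It deliberately uses a child that is terminated rather than stopped, since the stopped case is exactly what is in question.

```shell
# Run a child that kills itself with SIGTERM, and capture its exit status.
# The "|| status=$?" form keeps this working even under "set -e".
status=0
sh -c 'kill -TERM $$' || status=$?

# kill -l with an exit status argument names the signal that produced it.
name=$(kill -l "$status")
echo "status=$status signal=$name"
```

In Harald's test case the same kill -l $? step is applied to whatever wait returned, which is why the different shells' answers could be compared at all.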
Since I clearly need to fix the inconsistency in the NetBSD shell (which of itself is not hard - it just forgot the possibility that jobs might be stopped when looking to see if the process is ready) I thought it might be a good idea to fix all of this properly.

Note that none of this makes much practical difference - regular people just don't issue commands like Harald did - if you have just stopped a job with a ^Z you don't usually immediately issue a wait command for it! And in a script, usually the script and its background processes are all in the same process group, and most often jobs stop due to signals sent to the process group (^Z (SIGTSTP), or SIGTTOU or SIGTTIN), which result in the shell, and whatever process(es) are running, all stopping and, usually, resuming together. (This is not guaranteed: a script can turn on -m, which runs background processes in their own process groups, and the shell's children can be stopped by one of those signals (or SIGSTOP) being sent to them via the kill system call, but in practice neither of those normally happens.)

The question for now is what our behaviour ought to be in these odd cases. My inclination is to make wait behave as POSIX specifies, and only return (normally) when the process named (or all children, if there are no args, or any of them with our -n option) has exited. Then add an option to wait (probably -s) to indicate that wait should complete if the (or a, or all, depending upon its usage) process enters the stopped state (and return as status the standard shell wait encoded status for "exited with signal N", except that it would be interpreted as "stopped by signal N").

My inclination is to go that way, rather than having default wait complete when a (selected) job stops, with a possible option to avoid that, as I have seen almost no scripts which use wait and are capable of dealing with stopping children. Not to say that none exist, just that they're by far the more unusual case.
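As an illustration of the unusual scripted case mentioned above (a child stopped by a signal sent with kill), this is roughly what a script has to do today if it wants wait to deliver the child's real exit status: resume the child itself before waiting. It is only a sketch; the sleep duration and variable names are arbitrary, and it does not use or depend on the possible -s option.

```shell
# Start a background child, then stop it as an external agent might.
sleep 1 &
pid=$!
kill -STOP "$pid"

# A plain "wait" at this point could hang, or misbehave in the ways
# described above, so resume the child explicitly before waiting for it.
kill -CONT "$pid"

status=0
wait "$pid" || status=$?
echo "child $pid finished with status $status"
```

Very few scripts bother with anything like this, which is part of why a wait that only returns on exit looks like the safer default.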
In addition, I'm inclined to copy ksh93 (and zsh) and have wait resume a stopped job whose completion it is to await (though that could be by an option, or with an option to inhibit it) -- waiting for a stopped job would obviously not do that.

But before I make any of that happen, I'd like to read opinions of others about how all of this should work (just don't bother with "change nothing", as what we have now is clearly wrong).

kre