> > It does look in the table of saved exit statuses, returning 1.
>
> It doesn't. In this case, the code path it follows marks the job as dead
> but doesn't mark it as notified (since it exited normally), so it's still
> in the jobs list when `wait -n' is called, and available for returning.
> That's probably a bug there.

Got it. So wait -n is intended to behave just as the documentation says
-- "next" job -- and if there's a bug, it's with how normally-exiting
processes are handled, not signal-exiting processes. Thank you for your
patience.

> > There's also an interaction in that "wait" will only look at the
> > terminated table if "-n" is not specified *and* ids are specified.
>
> This is to maintain POSIX semantics, with extensions. This is one of the
> issues -- should `wait -n' with arguments look for terminated processes
> in that table, the way `wait' without options does?

Yes, I do want wait -n to look in the terminated table, at least for my
use case: responding to jobs finishing, one at a time, as soon as
possible. I don't think wait -n can reliably do this, since there is
always a race between a job finishing and being handled, the next job
finishing, and the subsequent call to wait -n. Even if I query "jobs" to
see whether multiple jobs have terminated, the next finishing job could
still race.

You've pointed out clearly that my mental model of wait -n was wrong, so
please bear with me if I still don't have this right. Is there some other
best practice for this use case? It might be "use a SIGCHLD handler and
query jobs to see what jobs have terminated, then call wait <pid> on
each" or "I don't recommend using bash/sh for this." Obviously I could
also be overlooking some aspect of wait -n or other bash features that
would help here.

I _don't_ want bash to maintain some sort of internal state about which
jobs have and haven't been returned by wait -n, which would be
complicated and brittle (this is what my mental model was). I'd want it
to look in the terminated table for finished jobs amongst the provided
list of pids, and then I'd manage the list of pids myself, removing pids
that were previously returned from wait -n. This is a change in semantics
and might introduce inconsistencies and implementation difficulty; I'm
just describing what I think would be useful for my specific needs.
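
To make the "manage the list of pids myself" idea concrete, here is a
rough sketch of what I can do today with polling instead of wait -n. It
is only an illustration under assumptions: reap_finished, pids and
exit_status are names I made up, every pid in the list is a direct child
of this shell, and pid reuse between polls is ignored. It also trades
"as soon as possible" for a polling interval, so it is a workaround
rather than an answer.

```shell
#!/usr/bin/env bash
# Sketch only: reap_finished, pids and exit_status are hypothetical names,
# not bash builtins.

pids=()          # PIDs of background children not collected yet
exit_status=()   # sparse indexed array: exit_status[pid] = exit status

reap_finished() {
    local pid remaining=()
    for pid in "${pids[@]}"; do
        if kill -0 "$pid" 2>/dev/null; then
            # Process still exists (running, or exited but not yet reaped).
            remaining+=("$pid")
        else
            # Assumption: pid was our child, so bash has already reaped it
            # and `wait <pid>` returns the saved status without blocking.
            wait "$pid"
            exit_status[$pid]=$?
            printf 'pid %s exited with status %s\n' "$pid" "${exit_status[$pid]}"
        fi
    done
    # Keep only the pids that have not been collected.
    pids=("${remaining[@]}")
}

# Usage: start two jobs, then poll until both have been collected.
(exit 3) &
pids+=("$!")
(sleep 0.2; exit 5) &
pids+=("$!")

while ((${#pids[@]} > 0)); do
    reap_finished
    sleep 0.05
done
```

Because the script removes each pid from its own list once the status has
been collected, bash does not need to remember which jobs it has already
reported.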

A bit of brainstorming: between Linux's pidfds and BSD's kqueue/process
descriptors, one ought to be able to build this as an external command
that polls for non-child processes to terminate. It couldn't return an
exit status, but it could at least indicate which process finished, or
which couldn't be found and thus had already finished. Then you could use
POSIX "wait <pid>" to get the exit status and be guaranteed that it
wouldn't block (a simple timeout option to wait might be useful here for
cases where bash's child process may not be visible to an external
command). I'm not aware of anything like this existing, but it would be a
nice way to separate this functionality from the shell, reduce the number
of options in wait, and support other shells.

Again, thanks for your patience Chet,
Steve