Date: Tue, 15 Mar 2022 16:48:09 +0100
From: Edgar Fuß <e...@math.uni-bonn.de>
Message-ID: <yjc1or7f2yj4x...@trav.math.uni-bonn.de>

| I guess "enters stopped state" includes the case where the process
| already was in the stopped state when the wait command was issued?

Yes, sequencing doesn't matter (though it makes a difference with the
current code, that's a prime motivation for fixing it).

| I don't have any strong opinion, but also find it slightly more natural
| that way. [normal wait only waits for exited jobs]

Yes, that is in my sources now, being tested (very slowly, as I find the
time... this has been this way for quite a while, there's no great hurry
to fix it, if I miss the -10 branch, this can easily be pulled up).

So, as there were no comments or objections (until your comment just now,
and apart from a useful off-list discussion with Mouse which helped me
clarify some things I was thinking), that's the way things are going to go.

First commit will be to just make wait(1) (in sh) ignore stopped jobs.
(They will be treated identically to running ones).

Next an option will get added to allow wait to return stopped processes
as well as exited ones (I briefly considered allowing waiting for only
stopped processes, ignoring exited ones, but doing that makes no sense
at all, so that won't happen).

I will probably also implement a do-nothing option to wait only for exited
jobs, in the hope that those shells where wait waits for stopped and exited
jobs by default will pick up that option, and wait only for exited ones -
and ideally also our wait for stopped or exited jobs option, which would
be a no-op for them. This is just to allow portable scripts to be written.

Lastly, and just possibly, this I am not sure of yet, probably won't be
until I implement it, and try it out for a while to see whether it works
sanely, if the shell is waiting for a job (without the option to return
stopped jobs) and that job stops, then resume it in the foreground.
It has to be foreground, if backgrounded, whatever made it stop previously
will probably make it stop again immediately, which would result in the
shell simply going into a loop continually restarting that job.

The discussions with Mouse raised the question of waiting on multiple
jobs (as for example, the simple "wait" command with no args at all) if
several of them stop. For that, we would need to restart them in
foreground, serially. As in the last paragraph, they cannot be in the
background, and since each separate job will be in its own process group,
only one of them can be foreground at a time (the controlling tty must be
in the foreground job's process group).

Hence, as unappealing as it sounds, when waiting for more than one job,
if more than one of them stops, resume one, wait for that one to finish
(any others that finish without stopping are fine of course, and get
included in the wait), then restart the next, wait for it (restarting it
again if it stops again) until it completes, and on to the next...

Aside from not doing the restart at all, which is certainly still a
possible outcome, there really is no other choice (in that case, the shell
would simply wait forever, and the stopped job(s) would need to be
continued, or killed, externally).

And finally, to repeat something I said last time ... this is all just
an obscure corner case. In practice, people rarely do wait interactively,
and if they do, and a job stops (and we don't do auto restart in that
case) the "stopped" message will still appear on stderr (or stdout, or
wherever those things appear now, I've forgotten) and the user can SIGINT
the wait and carry on.
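
As a rough illustration of the serial restart order described above (a
dry-run model only, not code from sh), the following just prints what
would happen and touches no real jobs. The function name serial_wait and
the NAME:N argument format are invented for this example; N is a made-up
count of how many times a job stops before it finally exits, and jobs that
exit without stopping are simply reported in list order.

```shell
#!/bin/sh
# Dry-run model of the serial restart scheme; nothing is actually stopped
# or resumed. Each argument is NAME:N (hypothetical format), where N is an
# assumed number of times that job stops before it exits.
serial_wait() {
    for spec in "$@"; do
        name=${spec%%:*}
        stops=${spec##*:}
        # A stopped job is resumed in the foreground, repeatedly if it
        # stops again, until it exits; only then is the next job handled.
        while [ "$stops" -gt 0 ]; do
            echo "resume $name in foreground"
            stops=$((stops - 1))
        done
        echo "$name exited"
    done
}

serial_wait a:1 b:0 c:2
```

The point of the model is only that at most one job is ever "in the
foreground" at a time, and that a job which keeps stopping holds up the
ones queued behind it.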

None of this applies to normal scripts, which don't usually enable job
control, so the shell running the script, and everything it runs, are all
in the same process group. If something stops one of the processes (other
than a SIGSTOP sent just to one process - which must be from some other
agent, which can be assumed will SIGCONT it when appropriate) then the
whole process group gets signalled, which stops the shell along with its
children. There's no question of what the shell should do in this state -
the only possibility is nothing.

So, none of what happens as a result of this "discussion" should affect
anything that almost anyone ever sees.

kre

| Long ago, I used processes stopping themselves as a primitive
| synchronisation tool (not from a shell script, however). I used an ELC
| to feed four CD writers, which worked well when the four cdrdao processes
| were in sync, but miserably failed otherwise. So I added a --stop option
| to cdrdao which stopped the process as soon as the lengthy initialization
| was complete and then manually issued a kill -CONT to make them continue.
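
For the script case mentioned above (a SIGSTOP sent to just one process by
some other agent, which later sends SIGCONT), a minimal sketch might look
like this. It assumes a shell whose wait does not return for a stopped
child when job control is off, which is my understanding of common shells
such as bash and dash; the second background subshell is only a stand-in
for the outside agent, and the exit status 7 is arbitrary.

```shell
#!/bin/sh
# No "set -m": job control is off, as in an ordinary script.

# Child that would exit with (arbitrary) status 7 after about a second.
( sleep 1; exit 7 ) &
pid=$!

# Stand-in for the "other agent": stop just that one process, then
# continue it a second later.
kill -STOP "$pid"
( sleep 1; kill -CONT "$pid" ) &
agent=$!

# wait with a pid operand reports that process's exit status, so under
# the assumption above it only returns once the child has been continued
# and has exited.
wait "$pid"
status=$?
wait "$agent"
echo "child $pid exited with status $status"
```

If the SIGCONT never arrived, the wait would simply block, which matches
the "only possibility is nothing" behaviour described for scripts.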