On 9/29/24 12:55 PM, Zachary Santer wrote:
CWRU/CWRU.chlog:

9/25 ---- jobs.c - wait_for_any_job: if the jobs table is empty and there are no eligible procsubs, and the shell is in posix mode, take a random pid from the bgpids table, delete it, and return its status (since we would be deleting that pid from bgpids anyway)

This is a really strange thing to implement.
It turned an error condition into a potentially successful return.
So potentially 'wait -n' is waiting for a background job that's still executing when there's an already-terminated background job that 'wait -n' would report right then, had it not been notified.
This is reasonable. I changed the order in recent pushes.
On Wed, Sep 25, 2024 at 11:06 AM Chet Ramey <chet.ra...@case.edu> wrote:

On 9/8/24 8:35 PM, Zachary Santer wrote:

This is still a discussion about interactive shell behavior only.

I might argue that calling 'jobs' within a script being executed normally shouldn't make background jobs that have already terminated unavailable to 'wait -n' either.
Make up your mind. Is non-interactive shell behavior good, as you've said before, and again at the end of this message, or is it not?
I'm considering using posix mode all the time, just to see if it makes my life easier. Not that I know what it does, outside of this.
You could read the file POSIX in the devel branch, or break down and read the texinfo manual, or look at http://tiswww.case.edu/~chet/bash/POSIX though that describes the current release.
That's not how job control works. Jobs are created and job numbers assigned when the background process is created.

I was thinking the job id doesn't do the user any good if it's for a job that they don't have the opportunity to act upon. It's come and gone by the time they've got a command line again.
Act on how? If it terminates, you're not going to jobs/fg/bg/kill. You can still wait. (And who says they won't? Consider a SIGCHLD trap, or a command later in the list running one of the above commands.)
2) Job ids assigned to background jobs continue to increase monotonically, between accept-line and prompt, even as some of those jobs are removed from the jobs table by calls to 'wait'.

No shell works this way, and there's not a good reason to adopt it.

I might be missing something, but bash sure seems to be doing this in a number of different calls to wait-n-failure::main, on the current devel branch commit. Are the jobs not being removed from the jobs table until some later point?
What? When jobs are added to the jobs list, they use the index after the largest index with a job. If you have four jobs, 1-4, and job 3 terminates, the next job created gets index 5. If job 4 terminates instead, the next job created gets index 4. Is this not what you're seeing?
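The numbering rule described above can be written out as a toy model in shell. This is only a sketch of the rule as stated in this message, not bash's actual implementation, and the function name next_job_index is made up for illustration.

```shell
# Toy model of the rule described above: a new job gets the index
# after the largest index currently in use. next_job_index is a
# hypothetical helper, not a bash builtin.
next_job_index() {
    max=0
    for idx in "$@"; do
        if [ "$idx" -gt "$max" ]; then
            max=$idx
        fi
    done
    echo $((max + 1))
}

# Jobs 1-4 exist and job 3 terminates: 1, 2 and 4 remain.
next_job_index 1 2 4    # prints 5
# Jobs 1-4 exist and job 4 terminates instead: 1, 2 and 3 remain.
next_job_index 1 2 3    # prints 4
```

The arguments are the indices still occupied in the jobs list, so a gap left by job 3 is not reused while a higher-numbered job is still present.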
It feels like I've confused myself by this point. I was considering what bash would have to do, to not display job notifications at any point except immediately prior to displaying the following prompt. Bash is, of course, displaying job status notifications as jobs are forked and as they terminate. That *is* prior to the following prompt, but not *immediately* so. So the behavior I was trying to describe would be a departure from the current posix mode behavior, but it clearly isn't necessary.
Not in posix mode. This is how bash has always behaved in non-posix mode:

$ sleep 3 & sleep 4; echo a
[1] 48391
[1]+ Done sleep 3
a
$

And this is posix mode:

$ sleep 3 & sleep 4; echo a
[1] 48395
a
[1]+ Done sleep 3
$
If the 'jobs' builtin is called in the midst of a command list being run with either behavior, this would cause the same updates to the jobs table and list of saved pids and statuses as would occur immediately prior to a prompt.

So you are saying that prompt notifications and `jobs' have the same effect. POSIX implies but does not require this, and there is differing behavior among current implementations.

I've got no opinion on this point, actually.
You just described it. Are you saying you don't mind either behavior?
The user would have to know that calling the 'jobs' builtin would have an impact on what processes 'wait -n' without id arguments will return the termination status of. That would have to be documented in the man page.

This is posix mode.

How does the user know that?
How does a user know anything? What's the difference between "documented in the man page" (presumably in JOB CONTROL or the `wait' description?) and "documented as part of posix mode"?
In that case, the behavior they would see, using 'wait -n', has already changed for the better. The use of 'wait -n' without pid arguments in an interactive environment is more likely to be something that a user just typed on the command line themselves.

Why would a user do this? What's the use case for doing that in an interactive shell? Not that it really matters.

Maybe testing out a bit of functionality they're trying to implement elsewhere.
I would hope that people understand the difference between interactive and non-interactive shells and how they can have different behavior.
If the behavior here isn't modified, the man page really should note that 'wait -n' without id arguments won't return the termination status of a child process that has already been notified through the 'jobs' output.

That is exactly the behavior posix seems to require (`wait -n' aside, but see below): once you notify the user, you delete the job and it disappears forever.

Should still be in the man page. Very few shell programmers are reading the POSIX standard.
I added a mention to the job control section in the man page and info file, and reworked the text in the posix mode section.
There is a posix mode section in bash.info.
There is also a URL in the man page that links to a file with the same information. The man "page" is already 95 pages, does it really need to be longer?
bash.info:

This manual is meant as a brief introduction to features found in Bash. The Bash manual page should be used as the definitive reference on shell behavior.
That means that if the man page and the info file differ on something, the man page is authoritative, not that the info file is meaningless. Otherwise, why have it?
For instance, with the latest devel branch build:

$ set -o posix
$ sleep 2 &
[1] 20565
$
[1]+ Done sleep 2
$ wait 20565
$ echo $?
0
$ wait 20565
bash: wait: pid 20565 is not a child of this shell
$ echo $?
127
$

This is what you refer to below.

Yeah, I think that's an improvement. As long as posix mode never makes that termination status unavailable to the first 'wait' call, because it was already notified, then posix mode seems like the way to go.
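For anyone who wants to try the double-wait case from a script rather than at a prompt, something like the following might do. It is a rough sketch, not a reproduction of the interactive transcript: there is no prompt notification in a script, the helper name wait_twice is made up, and the result of the second wait may differ between bash versions and between posix and default mode.

```shell
# Sketch: wait for the same background pid twice and report both
# exit statuses. wait_twice is a hypothetical helper name.

# Assumption: posix mode is the setting under discussion, so turn it
# on when the interpreter is bash; other shells skip this step.
if [ -n "${BASH_VERSION:-}" ]; then
    set -o posix
fi

wait_twice() {
    true &
    pid=$!
    # First wait: collects the child's termination status.
    if wait "$pid"; then first=0; else first=$?; fi
    # Second wait: the same pid again; the error message is discarded.
    if wait "$pid" 2>/dev/null; then second=0; else second=$?; fi
    echo "$first $second"
}

wait_twice
```

The two numbers printed are the statuses of the first and second wait. In the interactive transcript they were 0 and 127.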
We'll see.
'wait -n' with pid arguments now has access to this list, which is good. It wouldn't be going much further to allow 'wait -n' without pid arguments to act on the list as well.

Well, you'd either have to arrange things so the user doesn't get the same pid and status returned multiple times -- by removing it from this list or some other mechanism. Since that's what happens in posix mode, it looks like posix mode fits your use case here.

And now I know that, but I don't even use 'wait -n' for anything.
Then we're just having an academic conversation.
The point here was to try to get the behavior of 'wait -n' to be as consistent as possible, between different execution environments: the interactive shell, a script being sourced, and a script being executed normally; along with different set and shopt options. If you won't consider modifying the behavior of 'wait -n' without id arguments in default mode, then that's frustrating.
You might want to try posix mode for a while and see what happens. There are very few people who do that; I'd be interested in feedback.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU c...@case.edu http://tiswww.cwru.edu/~chet/