On 11/12/2013 07:35 AM, Steffen Dettmer wrote:
> Debian 7.2 with /bin/bash as login shell (via /etc/passwd), shopt
> huponexit off (as by default), bash run via SSH from other host.
> When closing shell with CTRL-D, "sleep &" continues to run. I had
> expected I had to use nohup, setsid, disown or a combination of them
> in order to keep background jobs running after ending a shell session.
Short answer: I doubt that this ever worked the way you think it did. If you're
using a shell with job control and run programs in the background, it is up to
the shell to deliver the HUP signal, and bash does so in only two cases:
huponexit is on, or bash itself receives a SIGHUP.
Long answer:
SIGHUP is not necessarily sent by the shell to background processes when it
exits, but more often by the controlling tty's driver or line discipline,
which on most Unixes (Linux included) is sadly a morass of cruft with
multiple APIs that evolved separately and later merged.
Back in the bad old days, when one used a "real" terminal on RS232 and
turned off the terminal, or logged in via a modem connected to the system
via RS232 and "hung up" the phone, the DCD (carrier detect) line would drop and both
foreground and background processes would get SIGHUP (hang up) from the tty
driver, because it was the "controlling tty" (a concept that still exists
today even though real terminals are almost extinct). Keep in mind, unless
one was using the C shell, this was before "job control".
Fast-forward to today: bash by default uses job control except when
executing a script, and in the case of SSH, a pseudo-tty is used to simulate
the "real" device and its driver [details at pty(7)].
If you have a read of setpgid(2) [also of interest: tty_ioctl(4)], you'll see
that (basically) on hangup of the tty device a SIGHUP is delivered to the
session leader, and when the session leader exits a SIGHUP also goes to the
"foreground process group of the controlling terminal". Without defining
the "foreground process group" too carefully, suffice it to say that
processes can be put in or out of it via system calls like setpgid(2) by the
shell, various "daemon starting" programs, or themselves. More important,
we can easily see which processes are in it by looking at the pgid and tpgid
columns of ps(1)'s output.
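If you want a quick way to read those two columns, here is a minimal sketch.
The name mark_fg is made up for illustration, not a standard tool. It assumes
ps-style input whose first three columns are PID, PGID and TPGID, and labels
each process by whether its PGID matches the TPGID:

```shell
# Hypothetical helper: label processes as foreground or background.
# Reads "ps -o pid,pgid,tpgid,args" style output (header included) on stdin.
mark_fg() {
  awk 'NR == 1 { next }   # skip the header line
       {
         # A process is in the foreground group when PGID equals TPGID.
         state = ($2 == $3) ? "foreground" : "background"
         print $1, state, $4
       }'
}
```

Run it as "ps -o pid,pgid,tpgid,args | mark_fg". Expect the interactive shell
itself to show up as background while ps runs, because with job control the
shell hands the terminal to the process group of the foreground job.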
For the final piece of the puzzle, check the relevant section of bash(1):
"The shell exits by default upon receipt of a SIGHUP. Before exiting, an
interactive shell resends the SIGHUP to all jobs [...] If the huponexit
shell option has been set with shopt, bash sends a SIGHUP to all jobs when
an interactive login shell exits." [An 'interactive' shell means
(basically) one that is running on a tty rather than reading a script from a
file.]
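To check which of those cases applies to a given session, something like the
following sketch may help. The function name is hypothetical; the "i" flag in
$- and the login_shell and huponexit shopt options are standard bash. It only
covers a normal exit: a SIGHUP delivered to bash itself is resent to the jobs
regardless of huponexit.

```shell
# Hypothetical helper: would this bash send SIGHUP to its jobs on a normal exit?
hup_on_exit_status() {
  # huponexit only matters for interactive shells ("i" appears in $-).
  case $- in
    *i*) ;;
    *) echo "not interactive: huponexit does not apply"; return 1 ;;
  esac
  # bash(1): huponexit takes effect when an interactive *login* shell exits.
  if shopt -q login_shell 2>/dev/null && shopt -q huponexit 2>/dev/null; then
    echo "login shell with huponexit on: jobs get SIGHUP on exit"
  else
    echo "jobs are left alone on a normal exit (EOF, exit, logout)"
    return 1
  fi
}
```

Pasted into the SSH session from the original question (huponexit off), it
should report that jobs are left alone; after "shopt -s huponexit" in a login
shell it should report the opposite.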
A little investigation with ps will show why your sleep process didn't
receive a SIGHUP: when job control is enabled, bash moves background jobs
out of the foreground process group; they therefore won't receive a SIGHUP
from the tty driver, and since (a) you are exiting bash via EOF and (b) you
don't have huponexit set, bash doesn't send it to them. Note that had bash
exited due to receiving a SIGHUP *itself* (which would happen e.g. if sshd
died and released the pty), it would have delivered the SIGHUP to all of its
jobs, foreground and background, which is one reason why you want to use
commands like nohup, disown, etc. if you want to really be sure that your
background commands continue to run even after you log out.
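As a rough illustration of the nohup approach, the sketch below wraps it in a
small helper (run_detached is a made-up name). It starts the command with
SIGHUP ignored and with stdin, stdout and stderr detached from the terminal,
then prints the PID. For a job that is already running, bash's "disown -h"
marks it so the shell does not forward a SIGHUP to it.

```shell
# Hypothetical helper: start a command that survives SIGHUP, print its PID.
run_detached() {
  # nohup sets SIGHUP to "ignore" and then execs the command, so $! is the
  # PID of the command itself. Redirecting all three streams keeps the
  # process from holding on to the terminal and avoids creating nohup.out.
  nohup "$@" </dev/null >/dev/null 2>&1 &
  echo "$!"
}
```

For example, "pid=$(run_detached sleep 2h)" leaves a sleep running whose
IGNORED mask in ps has the SIGHUP bit set.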
The following session log demonstrates all of this. I use 'sleep 1h' and
'sleep 2h' to make clearer in the output of 'ps' which command was run by
'nohup' (but notice also the 'ignored' column).
~ % # ===>
~ % # ===> First, let's run some commands in the background with job-control enabled.
~ % # ===>
~ % ssh localhost
...
~ % sleep 1h &
[1] 32187
~ % nohup sleep 2h &
[2] 32204
~ % nohup: ignoring input and appending output to `nohup.out'
~ % ps -o pid,ppid,pgid,sess,tpgid,tty,ignored,args
PID PPID PGID SESS TPGID TT IGNORED COMMAND
31723 31722 31723 31723 32281 pts/21 00384004 -bash
32187 31723 32187 31723 32281 pts/21 00000000 sleep 1h
32204 31723 32204 31723 32281 pts/21 00000001 sleep 2h
32281 31723 32281 31723 32281 pts/21 00000000 ps -o pid,ppid,
~ % # Notice ^^^^^ that the jobs' PGIDs differ from both bash's PGID and the TPGID.
~ % exit
logout
Connection to localhost closed.
~ % # Check that both 'sleep' processes are still running:
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
32187 1 32187 31723 -1 ? 0000000000000000 sleep 1h
32204 1 32204 31723 -1 ? 0000000000000001 sleep 2h
~ %
~ % # Demonstrate the effects of nohup on one of them:
~ % kill -HUP 32187 32204
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
32204 1 32204 31723 -1 ? 0000000000000001 sleep 2h
~ %
~ % # OK, now kill it too:
~ % kill -TERM 32204
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
~ %
~ % # ===>
~ % # ===> Try the same thing again, with job control disabled.
~ % # ===>
~ % ssh localhost
...
~ % set +m
~ % sleep 1h &
[1] 677
~ % nohup sleep 2h &
[2] 706
~ % nohup: appending output to `nohup.out'
~ % ps -o pid,ppid,pgid,sess,tpgid,tty,ignored,args
PID PPID PGID SESS TPGID TT IGNORED COMMAND
677 32636 32636 32636 32636 pts/21 00000006 sleep 1h
706 32636 32636 32636 32636 pts/21 00000007 sleep 2h
765 32636 32636 32636 32636 pts/21 00000000 ps -o pid,ppid,
32636 32635 32636 32636 32636 pts/21 00384004 -bash
~ % # Notice ^^^^ that this time all processes have the same PGID, which is also the TPGID.
~ % exit
logout
Connection to localhost closed.
~ % # Now only the nohup-protected process remains:
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
706 1 32636 32636 -1 ? 0000000000000007 sleep 2h
~ % kill 706
~ % # ===>
~ % # ===> One more time, *with* job-control, but terminate bash via SIGHUP
~ % # ===> (simulating a turned-off terminal, lost network connection, etc.)
~ % # ===>
~ % ssh localhost
[...]
~ % sleep 1h &
[1] 4643
~ % nohup sleep 2h &
[2] 4644
~ % nohup: ignoring input and appending output to `nohup.out'
~ % ps -o pid,ppid,pgid,sess,tpgid,tty,ignored,args
PID PPID PGID SESS TPGID TT IGNORED COMMAND
4580 4579 4580 4580 4646 pts/21 0000000000384004 -bash
4643 4580 4643 4580 4646 pts/21 0000000000000000 sleep 1
4644 4580 4644 4580 4646 pts/21 0000000000000001 sleep 2
4646 4580 4646 4580 4646 pts/21 0000000000000000 ps -o p
~ % # Separate ^^^^ PGIDs this time: the tty won't deliver SIGHUP; it's up to the shell.
~ % kill -HUP $$
Connection to localhost closed.
~ %
~ % # Non-nohup-protected process got SIGHUP; the other remains:
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
4644 1 4644 4580 -1 ? 0000000000000001 sleep 2h
~ % kill 4644
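The IGNORED column in the listings above is a hex bitmask in which bit N-1
stands for signal number N. Here is a small sketch for decoding it; the
function name is made up, and the signal names in the comments assume Linux
numbering on x86 (1 = HUP, 2 = INT, 3 = QUIT, 15 = TERM, 20 = TSTP,
21 = TTIN, 22 = TTOU):

```shell
# Hypothetical helper: list the signal numbers set in a ps IGNORED mask.
# Runs in a subshell "( ... )" so its variables do not leak to the caller.
ignored_signals() (
  mask=$(( 0x$1 ))        # parse the hex mask printed by ps
  sig=1
  out=
  while [ "$sig" -le 64 ] && [ "$mask" -ne 0 ]; do
    # The lowest bit of the shifted mask corresponds to signal number $sig.
    if [ $(( mask & 1 )) -eq 1 ]; then
      out="$out${out:+ }$sig"
    fi
    mask=$(( mask >> 1 ))
    sig=$(( sig + 1 ))
  done
  echo "$out"
)

ignored_signals 00000001   # nohup'd sleep:          1          (HUP)
ignored_signals 00000006   # sleep after "set +m":   2 3        (INT QUIT)
ignored_signals 00000007   # nohup'd, after set +m:  1 2 3
ignored_signals 00384004   # interactive bash:       3 15 20 21 22
```

So the only processes in the logs that ignore SIGHUP are the ones started
through nohup, which matches what survived each experiment.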
-- David