Hey, Guillem Jover schrob: > Actually, and this is something I've had pending raising for a long > time, I don't understand that two s-s-d invocations pattern in stop > which seems a bit pointless TBH? The --retry should make s-s-d wait > for the child to exit, or terminate it forcibly.
I think the idea behind that construct is preventing the following race condition: 1) s-s-d scans for matching processes, sends TERM to at least one worker child of the daemon, gets interrupted. 2) The worker child exits obligingly. 3) The daemon itself, not having gotten TERM yet, notices the SIGCHLD and starts a replacement worker process. 4) s-s-d gets resumed and proceeds with terminating the remaining processes it found in step 1, who subsequently die one way or the other. 4a) s-s-d does NOT rescan for newly appearing matches, thus missing the newly appeared worker from step 3. 5) Finding all matched processes gone, s-s-d exits 0. 6) The worker process from step 3 is still alive, causing issues. Notes: * I don't know whether s-s-d rescans in step 4a. But if it does, then I really can't see a point to all this. * You can replace steps 2+3 by "the daemon randomly decides to spawn a new child". * If the daemon, upon TERM, first kills its children, waits for them to die, and only then exits itself, then the race can't happen. That's what I'd expect as standard behaviour. But obviously, the init script also needs to work in situations where the daemon doesn't behave properly. > I also don't understand the diverging --name usage, because DAEMON > cannot have a different name to the --startas argument anyway, so it > cannot be used for interpreted scripts. Yes, that's weird. Let me think out loud here for a bit: --exec When matching, specifies the path of the executable running. That's the interpreter in case of scripts. (i.e. readlink /proc/$PID/exe) --startas How to --start the daemon. Defaults to --exec. --name Matches the /proc/$PID/comm entry (at least on Linux), which is the basename of the way it was started. It can be arbitrarily changed by the process (e.g. firefox-esr does this).¹ If that's correct, then a) "--start --startas $DAEMON --exec $DAEMON" (line 49) is redundant. b) "--stop --exec $DAEMON" (*both* stop invocations, line 78 and 89) will not match a $DAEMON that's a script, rather than a native executable. c) Hence init-d-script doesn't currently work with daemons that are implemented as a script.² d) Assuming $COMMAND_NAME is the basename of $DAEMON, then "--exec $DAEMON --name ${COMMAND_NAME}" is equivalent to "--exec $DAEMON" iff $DAEMON is the actual native executable (i.e. neither a symlink nor a script) and the process didn't change its "comm". e) The symlink case doesn't seem too interesting. A package should know which one's the real executable, and I can't come up with a use case for "stop only if started though *this* symlink". But maybe I'm missing things here. f) Presumably, whoever wrote the code had a daemon with children that changed their "comm" field, and the first s-s-d invocation is supposed to only kill the (parent) daemon. Hence, the first invocation matches on $PIDFILE and "comm", while the second invocation kills less discriminately. Thus I *think* the code does what it's supposed to do. Whether that's behaviour/complexity wanted enabled by default in /lib/init/init-d-script , I can't judge. Note tho that trying to support daemons implemented as scripts will necessarily blow away this code. Two more observations: * The second s-s-d --stop invocation has "--retry=0/30/KILL/5". What's that zero doing there? Shouldn't that be TERM instead, as in the first invocation? * Runit's sv(8) sends TERM and CONT to bring down a service gracefully, because a STOPped process obviously can't react to TERM. Maybe that's something you want to adopt as well? HTH and regards, Jan ¹ However, "comm" is distinct from argv[0], which *also* can be arbitrarily changed (e.g. openntpd does that). Argh. ² out-of-the box. Obviously, stuff can be overridden to make presumably anything work.
signature.asc
Description: PGP signature