On Wed, Apr 1, 2015 at 4:04 AM, Canek Peláez Valdés <can...@gmail.com> wrote:

> # If you have cgroups turned on in your kernel, this switch controls
> # whether or not a group for each controller is mounted under
> # /sys/fs/cgroup.
> [...]
> # Set this to YES if you want all of the processes in a service's cgroup
> # killed when the service is stopped or restarted.
> # This should not be set globally because it kills all of the service's
> # child processes, and most of the time this is undesirable. Please set
> # it in /etc/conf.d/<service>.
> # To perform this cleanup manually for a stopped service, you can
> # execute cgroup_cleanup with /etc/init.d/<service> cgroup_cleanup or
> # rc-service <service> cgroup_cleanup.
> # rc_cgroup_cleanup="NO"

As the comments point out, using this feature is apparently not
recommended - probably because most init.d scripts were never
written with it in mind.  Here are a few notes that might help
anybody trying it out, based on my systemd experience (where this
is standard functionality, and units are written with it in mind).
Please note that I'm not 100% sure how this is implemented in
openrc, so some of the potential issues below may already be mitigated.
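
For anybody who does want to try it, the quoted comments suggest a
per-service override rather than the global rc.conf.  A minimal
sketch, assuming the variable behaves exactly as those comments
describe; <service> is a placeholder, not a real name:

```shell
# Goes in /etc/conf.d/<service> (not in the global /etc/rc.conf).
# Kills every process left in this service's cgroup when the
# service is stopped or restarted.
rc_cgroup_cleanup="YES"

# One-off manual cleanup of an already stopped service, per the
# same comments:
#   rc-service <service> cgroup_cleanup
```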

Also note, I'm not trying to make any value statements here (foo is
better than bar) - the purpose of my email is to help educate
sysadmins about some of the possible unintended consequences of using
features like these.

1. As far as I'm aware, openrc still has no concept of a service
stopping or failing on its own; a service only stops when you
explicitly tell it to.  With systemd, if the main process dies, the
unit stops (and possibly fails), and the child processes are killed
automatically unless this is overridden.  So, don't expect the
behavior to be exactly the same.

2.  Some scripts, like apache's, might attempt graceful shutdowns.
I have no idea how the kill behavior of openrc interacts with this.
With systemd, care had to be taken in the unit to ensure that kills
were only sent after a suitable timeout, giving the graceful shutdown
a chance to complete - otherwise the apache2 graceful command returns
instantly and SIGTERMs get sent almost immediately afterwards.  The
openrc init.d script already does its own polling/killing for a
restart, so you might get issues with how these features interact.

3.  Sometimes leaving orphan processes around might be considered
intended behavior.  Any screen launched from an ssh session is going
to be a child of sshd and in its cgroup. If you completely kill the
cgroup, then you'll kill any user sessions inside unless they were
given some kind of special handling. I'm actually not 100% sure how
this is done in systemd (logind may put these in a different cgroup
already), but you'll certainly want to think about things like this.

4.  Not really an issue for openrc, but if you're running systemd
timer units, keep in mind that anything you fork from the main process
dies when the main process dies, so be careful with cron-style shell
scripts that run stuff in the background without waiting at the end.
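
To illustrate point 4, here is a minimal sketch of a cron-style script
that backgrounds its work and then waits for it.  The names run_jobs
and slow_task are made up for the example and stand in for whatever
the real script does; the only point is the wait at the end.

```shell
#!/bin/sh
# Hypothetical job: sleeps briefly, then appends a line to a log file.
slow_task() {
    sleep 1
    echo "$1 done" >> "$2"
}

# Start two jobs in the background, then block until both finish.
run_jobs() {
    log=$1
    slow_task first "$log" &
    slow_task second "$log" &
    # Without this wait the script exits while the jobs are still
    # running, and a service manager that cleans up the cgroup when
    # the main process exits would kill them before they finish.
    wait
}
```

Calling run_jobs /some/log/file from the script's last line keeps the
main process alive until the background work is done.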

I'd think that this is a feature openrc would want to make the default
at some point.  However, for that transition to be made maintainers
need to take another look at their scripts to make sure they still
work correctly.  That was never an issue for systemd since the
behavior was there from the start.

One thing I will say is that doing this sort of thing in the service
manager makes a LOT more sense than doing it in individual scripts.
Look at the apache2 init.d script sometime and compare it to the
systemd unit.  Most of the complexity in the init.d script is just
implementing stuff that systemd does natively, like graceful restarts
with cleanup of orphans and all that.  I'm not criticizing the apache2
script, but rather pointing out that one of the advantages of systemd
is that all of its units benefit from that kind of care without the
need to implement it in each script.  And, of course, killing child
processes can be configured per-service or even globally (though
turning it off globally probably isn't advisable, since many units
probably depend on systemd sending SIGTERMs followed by SIGKILLs as
its default action, and stuff wouldn't stop at all without this).
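
For anybody unfamiliar with the SIGTERM-then-SIGKILL sequence
mentioned above and in point 2, here is a rough sketch of the idea
for a single PID.  This is not how systemd or openrc implement it
(they act on the whole cgroup, and systemd's timeout is set with
TimeoutStopSec=); stop_with_timeout is a made-up helper name.

```shell
# stop_with_timeout PID SECONDS
# Send SIGTERM, poll once per second for up to SECONDS, then SIGKILL.
# Returns 0 if the process went away on its own, 1 if it was forced.
stop_with_timeout() {
    pid=$1
    timeout=$2
    kill -TERM "$pid" 2>/dev/null || return 0    # already gone
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after SIGTERM
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -KILL "$pid" 2>/dev/null || true        # grace period is over
    return 1
}
```

The grace period is what gives a graceful shutdown a chance to
complete; with a timeout of zero the SIGKILL follows the SIGTERM
immediately.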

-- 
Rich
