Re: systemd-analyze security as a release goal

Trent W. Buck Tue, 04 Jul 2023 17:40:25 -0700

Marco d'Itri <m...@linux.it> writes:

> On Jul 04, Andrey Rakhmatullin <w...@wrar.name> wrote:
>
>> Cool but looks like a lot of work.


[...]

>> start with applying all of them and then looking what needs to be
>> disabled?
> This is what I do.

FYI below is my basic workflow.
Once you've done 2-5 daemons, you get a "feel" for the trouble spots.
Hardening a unit from EXPOSURE=10 to EXPOSURE=3 usually takes me 1-4 hours.
If I've used the daemon before & know its config format & source code, usually 1 hour.

I typically start with a "deny all" ruleset.
Either I copy-paste from another daemon I did earlier, or
I copy-paste from "systemd-analyze security".
A slightly out-of-date one is
https://github.com/cyberitsolutions/prisonpc-systemd-lockdown/blob/main/systemd/system/0-EXAMPLES/20-default-deny.conf

Usually the daemon segfaults immediately.
In "coredumpctl" I see what the last syscall was.
Typically it is setuid, so I know to allowlist these:

    SystemCallFilter=@setuid
    CapabilityBoundingSet=CAP_SETUID CAP_SETGID

This is because the daemon does a no-op setuid(123) even if it's ALREADY 123
(due to User=%p in frobozzd.service).
This could be patched away, but so far my policy has been
"focus on stuff that doesn't require patching", so
instead I just allow that syscall.

It is very common to need both AF_UNIX and AF_NETLINK, so I don't even try to block those.
Things that need network (e.g. postfix, nginx) would also need AF_INET, AF_INET6, IPAddressAllow=any, &c.
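
As a rough sketch of what that allowlist might look like as a drop-in: the helper name write_network_dropin, the file name network.conf, and the optional second argument are all made up for illustration.
RestrictAddressFamilies= is the systemd directive I am assuming carries the AF_* list.

```shell
# Hypothetical helper: write a network allowlist drop-in for unit $1.
# $2 is the directory holding unit drop-ins (defaults to /etc/systemd/system).
write_network_dropin() {
    unit=$1
    root=${2:-/etc/systemd/system}
    mkdir -p "$root/$unit.service.d"
    cat >"$root/$unit.service.d/network.conf" <<'EOF'
[Service]
# Local sockets and netlink (almost always needed), plus IPv4/IPv6.
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
# Allow all peers; assumes the deny-all ruleset set IPAddressDeny=any.
IPAddressAllow=any
EOF
}
```

Something like "write_network_dropin frobozzd" followed by "systemctl daemon-reload" and a restart would apply it.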

The next most common failure is being unable to write to somewhere due to ProtectSystem=strict,
so I look for things like /run/frobozzd.pid or /var/lib/frobozzd/state.db in the error logs (journalctl -u frobozzd).
If systemd's existing things like RuntimeDirectory=%p aren't enough to cover it, I add ReadWritePaths=, or
downgrade ProtectSystem=strict to ProtectSystem=yes.
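
A minimal sketch of that log search, assuming the daemon logs the usual errno text ("Read-only file system" or "Permission denied") on the same line as an absolute path.
Many daemons phrase errors differently, so treat the output as hints only; the function name candidate_write_paths is made up.

```shell
# Hypothetical helper: read log lines on stdin and print the absolute
# paths found on lines that mention a write-style failure.
candidate_write_paths() {
    grep -E 'Read-only file system|Permission denied' |
        grep -oE '/[A-Za-z0-9._/-]+' |
        sort -u
}

# Usage (not run here):
#   journalctl -u frobozzd -b --no-pager | candidate_write_paths
```

Each path printed is a candidate for RuntimeDirectory=, StateDirectory= or ReadWritePaths=.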

If it's still crashing, I remove "SystemCallFilter=~@privileged @resources" and "CapabilityBoundingSet=" entirely.
If that works, I strace or bisect to find which syscalls must be allowlisted.

If it's *STILL* crashing, I bisect over the entire hardening denylist.
(Comment out half.  Does it work now?  If so, it's mad about the commented-out half.  Repeat.)
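
One way to mechanise a single bisection step is sketched below.
It assumes every directive sits on one "Key=Value" line under [Service]; the function name split_dropin and the .a/.b suffixes are made up.

```shell
# Hypothetical helper: split the directives of drop-in $1 into two halves,
# written to $2.a and $2.b, each with its own [Service] header.
split_dropin() {
    src=$1
    out=$2
    # Keep only directive lines; drop section headers, comments and blanks.
    directives=$(grep -E '^[A-Za-z]+=' "$src")
    total=$(printf '%s\n' "$directives" | wc -l)
    half=$(( (total + 1) / 2 ))
    { echo '[Service]'; printf '%s\n' "$directives" | head -n "$half"; } >"$out.a"
    { echo '[Service]'; printf '%s\n' "$directives" | tail -n +"$((half + 1))"; } >"$out.b"
}
```

Install the .a half as the drop-in and restart: if the daemon now runs, the culprit is in the .b half, so split that one next.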


The hardest part is the rare case where a daemon will automatically detect that an action failed, then
*silently* switch to a less-secure mode.
It is very hard to spot this is happening until after the hardened unit has been in production for a month or two.


PS: I typically have a dev loop like:

      journalctl -fu frobozzd &

      while ! systemctl restart frobozzd;
      do systemctl edit frobozzd; done

    Or if it's on another host,

      M-! <hardening.conf ssh root@test '
          cat >/etc/systemd/system/frobozzd.service.d/hardening.conf;
          systemctl daemon-reload;
          systemctl restart frobozzd;
          systemctl status frobozzd'

PPS: so far I've been talking about system units, but
     user units can also have hardening!

     For example, I bet this only needs write access to /sys/blah/rfkill, and
     could have its TCP privileges revoked:

        org.gnome.SettingsDaemon.Rfkill.service 9.8 UNSAFE 😨

     Also by default "systemd-analyze security" doesn't mention timer/path-fired units like e2scrub or fsck.
     If you want to see those you have to do something like "systemctl list-units --all --type=service".
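
A rough sketch of feeding that list back into "systemd-analyze security".
It assumes the column order UNIT LOAD ACTIVE SUB DESCRIPTION that list-units prints with --plain --no-legend; the function name service_names is made up.

```shell
# Hypothetical helper: read `systemctl list-units` output on stdin and
# print the names of service units whose LOAD column is "loaded".
service_names() {
    awk '$2 == "loaded" && $1 ~ /\.service$/ { print $1 }'
}

# Usage (not run here):
#   systemctl list-units --all --type=service --plain --no-legend |
#       service_names | xargs systemd-analyze security --no-pager
```

Given explicit unit names, "systemd-analyze security" prints the long per-setting report for each unit rather than the one-line summary table, so expect a lot of output.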
