Hi,

I've been toying around with this idea for a bit, and I wanted to solicit
some opinions on the design and whether this seems like something that
FreeBSD would find useful outside of my own tree.

Background: I've been thinking a lot lately about secure product design and
threat models, and wondering what kinds of things one could incorporate into
their design as a defense-in-depth kind of thing. This is, of course, not
something I would pitch as a strong isolation mechanism, but rather as a
mechanism to protect against some less sophisticated threats.

The basic idea that I'm proposing is the ability to seal a jail to turn it
into a 'capsule'. You can either seal it at creation time, or while it's
already running. If you create the jail as a capsule, you must attach to it
at the same time. Sealing it later is a compromise to give the system some
runway to configure the jail first, presumably before other user activity
could start and try to compromise the capsule before it's sealed.
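
To make the two sealing paths concrete, here is a rough model in Python. It
is only a sketch of the rules as described, not kernel code; Jail,
create_jail and seal are made-up names for illustration and do not
correspond to any existing FreeBSD interface.

```python
from dataclasses import dataclass


@dataclass
class Jail:
    """Toy stand-in for a jail; hypothetical, not a real FreeBSD interface."""
    name: str
    sealed: bool = False
    attached: bool = False  # did the creating process attach at creation?


def create_jail(name, capsule=False, attach=False):
    # Assumption from the proposal: a jail created as a capsule must be
    # attached to in the same step.
    if capsule and not attach:
        raise ValueError("creating a capsule requires attaching at creation time")
    return Jail(name=name, sealed=capsule, attached=attach)


def seal(jail):
    # Sealing a running jail: the system configures it first, then seals it.
    # This model has no unseal counterpart.
    jail.sealed = True
    return jail
```

In this model, create_jail("web") followed by seal() corresponds to the
"configure first, seal later" path, while create_jail("vault", capsule=True,
attach=True) is the sealed-at-creation path.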

Once sealed, the capsule has the following properties (that I've thought
about, at least):
- The capsule may not be unsealed
- Processes outside of the capsule may not attach to it
- Unprivileged users in the parent cannot see or tamper with processes in
  the jail, regardless of the security.bsd.see_* sysctls. The persist
  parameter and all of the allow.unprivileged_* jail knobs will be forcibly
  unset, and attempts to set them after sealing will result in errors
- Privileged processes may see and signal the processes in a capsule if
  securelevel is <= 0, but they cannot attach to, debug, or cpuset
  individual processes in a capsule at any securelevel
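
The properties above can be read as a small decision table. The following
Python sketch models them under my reading of the list; every name in it
(outside_access_allowed, strip_forbidden_params, set_param) is hypothetical,
and a real implementation would live in the kernel's visibility and
privilege checks rather than look anything like this.

```python
from fnmatch import fnmatch

# Jail parameters that the proposal says are forcibly unset on sealing.
FORBIDDEN_PARAMS = ("persist", "allow.unprivileged_*")


def _forbidden(name):
    return any(fnmatch(name, pattern) for pattern in FORBIDDEN_PARAMS)


def outside_access_allowed(op, privileged, securelevel):
    """May a process outside a sealed capsule perform op on it?"""
    if op in ("unseal", "attach", "debug", "cpuset"):
        # Denied for everyone, at any securelevel.
        return False
    if op in ("see", "signal"):
        # Unprivileged users never; privileged ones only if securelevel <= 0.
        return privileged and securelevel <= 0
    raise ValueError(f"unknown operation: {op}")


def strip_forbidden_params(params):
    """Return a copy of params with the forbidden knobs removed (on seal)."""
    return {k: v for k, v in params.items() if not _forbidden(k)}


def set_param(params, sealed, name, value):
    """Set a jail parameter, rejecting forbidden knobs once sealed."""
    if sealed and _forbidden(name):
        raise PermissionError(f"{name} may not be set on a capsule")
    params[name] = value
```

For example, outside_access_allowed("signal", privileged=True,
securelevel=1) is False in this model, since signalling from outside is only
permitted to privileged processes at securelevel <= 0.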

The premise of a capsule is that you (attempt to) seal off access points
into the jail apart from a well-defined (by the software in the capsule)
security boundary. It is naturally not protected if the kernel is
compromised or in some other scenarios, but you eliminate a number of
threats where an attacker can manage to make syscalls but doesn't have the
tools available to escalate further. Capsules would simply be a building
block for a larger secure design.

An obvious elephant in the room here is filesystem access. A capsule would
force an attacker to get a little more creative if they want to tamper with
capsule processes, in particular if it's combined with a heightened
securelevel (or removal of other features like /dev/mem entirely), but it
does not stop an attacker from tampering with the filesystem to disrupt
capsule activities. This leaves a huge part of the capsule's protection up
to application design, which arguably eliminates many benefits of the idea.

I don't really have a good answer for how one might solve that. The rest of
the design is fairly straightforward to implement, but I suspect it might
get hairy if you try to block off parts of the filesystem (even from root,
maybe contingent on securelevel) based on whether the path has been used for
a capsule or not.

Comments/questions/tomatoes welcome. The idea was somewhat inspired by
enclaves and a design where one can slice off some CPUs to dedicate to the
capsule alone to try and mitigate some side-channel possibilities from other
user processes, but the initial capsule thought process doesn't go to the
extent of trying to carve out memory to dedicate to a capsule.

Thanks,

Kyle Evans