Re: RFC: seccomp-bpf support

Andres Freund Wed, 28 Aug 2019 11:41:48 -0700

Hi,

On 2019-08-28 14:23:00 -0400, Joshua Brindle wrote:
> > or some similar technology where the filtering is done by logic
> > that's outside the executable you wish to not trust.
> > (After googling for libseccomp, I see that it's supposed to not
> > allow syscalls to be turned back on once turned off, but that isn't
> > any protection against this problem.  An attacker who's found an ACE
> > hole in Postgres can just issue ALTER SYSTEM SET to disable the
> > feature, then force a postmaster restart, then profit.)


A postmaster restart might not be enough, because the postmaster's
restrictions can't be removed once in place. But all that's needed to
circumvent that is to force the postmaster to die, and rely on systemd
etc. to restart it.


> My preference would have been to enable it unconditionally but Joe was
> being more practical.

Well, the current approach is to configure the list of allowed syscalls
in postgres. How would you ever secure that against the attacks
described by Tom? As long as the restrictions are put into place by
postgres itself, and as long as they're configurable, such attacks are
possible, no?  And as long as extensions etc need different syscalls,
you'll need configurability.


> > I follow the idea of limiting the attack surface for kernel bugs,
> > but this doesn't seem like a useful implementation of that, even
> > ignoring the ease-of-use problems Peter mentions.
> 
> Limiting the kernel attack surface for network facing daemons is
> imperative to hardening systems. All modern attacks are chained
> together so a kernel bug is useful only if you can execute code, and
> PG is a decent vector for executing code.

I don't really buy that in the case of postgres. Normally, in a medium to
high security world, once you have RCE in postgres, the valuable data
can already be exfiltrated. And that's game over. The only real benefit
of a kernel vulnerability is that it might allow the attack to persist
for longer - but there's already plenty you can do inside postgres once
you have RCE.


> At a minimum I would urge the community to look at adding high risk
> syscalls to the kill list, systemd has some predefined sets we can
> pick from like @obsolete, @cpu-emulation, @privileged, @mount, and
> @module.

I can see some small value in disallowing these - but we're back to the
point where that is better done one layer *above* postgres, by a process
with more privileges than PG. Because then a PG RCE doesn't have a way
to prevent those filters from being applied.
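
To make the "one layer above" point concrete, here is a minimal sketch,
assuming Linux on x86-64 or aarch64 and the raw seccomp-bpf interface. The
names build_deny_filter and install_and_exec are made up for illustration
and are not part of any proposed patch. A launcher builds a deny-list
filter, installs it, and only then execs the target; seccomp filters
survive execve and cannot be removed, so code running inside the target
has no opportunity to skip them.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: the sketch only covers these two architectures. */
#if defined(__x86_64__)
#define LAUNCHER_ARCH AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define LAUNCHER_ARCH AUDIT_ARCH_AARCH64
#else
#error "sketch only covers x86-64 and aarch64"
#endif

/*
 * Hypothetical helper: write a BPF program into out[] that kills the process
 * on an unexpected architecture, fails each syscall in denied[] with EPERM,
 * and allows everything else. Returns the instruction count, or 0 if out[]
 * is too small. A simple deny list like this does not handle the x32 ABI
 * and is weaker than an allow list.
 */
size_t build_deny_filter(const int *denied, size_t n,
                         struct sock_filter *out, size_t cap)
{
    size_t need = 4 + 2 * n + 1;
    size_t k = 0;

    if (cap < need)
        return 0;

    /* Load the architecture and kill the process if it is not the expected one. */
    out[k++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                            offsetof(struct seccomp_data, arch));
    out[k++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                            LAUNCHER_ARCH, 1, 0);
    out[k++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                                            SECCOMP_RET_KILL_PROCESS);

    /* Load the syscall number and compare it against each denied entry. */
    out[k++] = (struct sock_filter)BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                                            offsetof(struct seccomp_data, nr));
    for (size_t i = 0; i < n; i++) {
        out[k++] = (struct sock_filter)BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K,
                                                (unsigned int)denied[i], 0, 1);
        out[k++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K,
                        SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA));
    }

    /* Anything not matched above is allowed. */
    out[k++] = (struct sock_filter)BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW);
    return k;
}

/*
 * Hypothetical launcher step: install the filter in this process, then exec
 * the target. Returns -1 on failure; does not return on success.
 */
int install_and_exec(const int *denied, size_t n, char *const argv[])
{
    struct sock_filter insns[64];
    struct sock_fprog prog;
    size_t len = build_deny_filter(denied, n, insns, 64);

    if (len == 0)
        return -1;
    prog.len = (unsigned short)len;
    prog.filter = insns;

    /* Required so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
        return -1;

    execvp(argv[0], argv);
    return -1;
}
```

A launcher would call install_and_exec() with, say, SYS_mount and
SYS_init_module in the deny list and argv naming the postmaster binary.
Everything the target later forks or execs inherits the filter, and
nothing configurable inside PG is involved in applying it.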


> The postmaster and per-role lists can further reduce the available
> syscalls based on the exact extensions and PLs being used.

I don't buy that per-session configurable lists buy you anything
meaningful. With an RCE in one session, it's pretty trivial to corrupt
shared memory to also trigger RCE in other sessions. And there's no way
seccomp or anything like that will prevent that.

An additional reason I'm quite sceptical about more fine-grained
restrictions is that I think we're going to have to go for some use of
threading in the next few years. While I think that's still far from
agreed upon, I think there's a pretty large number of "senior" hackers
that see this as the future.  You can have per-thread seccomp filters,
but that's so trivial to circumvent (just overwrite some vtable-like
data in another thread's data, and have it call whatever gadget you
want) that it's not even worth considering.
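
A toy illustration of why per-thread filters don't isolate anything,
assuming plain POSIX threads; all names here are made up. Seccomp state is
per thread, but memory is shared by every thread in the process, so a
thread under a tight filter can still redirect a function pointer that a
less restricted thread will call later. No seccomp calls appear in the
sketch, because the filter never sees a plain memory write.

```c
#include <pthread.h>
#include <stddef.h>

typedef int (*handler_fn)(void);

/* "vtable-like" data belonging to the worker thread. */
struct worker_state {
    handler_fn handler;
    int result;
};

static int intended_handler(void)  { return 1; }
static int attacker_handler(void)  { return 42; }

/* Stands in for a thread with a loose filter: it calls through the pointer. */
static void *worker(void *arg)
{
    struct worker_state *s = arg;
    s->result = s->handler();
    return NULL;
}

/* Stands in for a compromised thread with a tight filter: it needs no
 * syscall at all to overwrite another thread's data. */
static void *compromised(void *arg)
{
    struct worker_state *s = arg;
    s->handler = attacker_handler;
    return NULL;
}

/* Runs the two threads one after the other (so there is no data race) and
 * returns whatever the worker's handler produced. */
int run_demo(void)
{
    struct worker_state s = { intended_handler, 0 };
    pthread_t a, b;

    if (pthread_create(&a, NULL, compromised, &s) != 0)
        return -1;
    pthread_join(a, NULL);

    if (pthread_create(&b, NULL, worker, &s) != 0)
        return -1;
    pthread_join(b, NULL);

    return s.result;
}
```

run_demo() returns the attacker's value rather than the intended one,
even though the thread that did the overwriting never made a syscall a
filter could have rejected.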

Greetings,

Andres Freund
