Hi,

On 2019-08-28 14:23:00 -0400, Joshua Brindle wrote:
> > or some similar technology where the filtering is done by logic
> > that's outside the executable you wish to not trust.
> > (After googling for libseccomp, I see that it's supposed to not
> > allow syscalls to be turned back on once turned off, but that isn't
> > any protection against this problem. An attacker who's found an ACE
> > hole in Postgres can just issue ALTER SYSTEM SET to disable the
> > feature, then force a postmaster restart, then profit.)

A postmaster restart might not be enough, because the postmaster's
restrictions can't be removed once in place. But all that's needed to
circumvent that is to force the postmaster to die, and rely on systemd
etc. to restart it.

> My preference would have been to enable it unconditionally but Joe was
> being more practical.

Well, the current approach is to configure the list of allowed syscalls
in postgres. How would you ever secure that against the attacks
described by Tom? As long as the restrictions are put into place by
postgres itself, and as long as they're configurable, such attacks are
possible, no? And as long as extensions etc. need different syscalls,
you'll need configurability.

> > I follow the idea of limiting the attack surface for kernel bugs,
> > but this doesn't seem like a useful implementation of that, even
> > ignoring the ease-of-use problems Peter mentions.
>
> Limiting the kernel attack surface for network facing daemons is
> imperative to hardening systems. All modern attacks are chained
> together so a kernel bug is useful only if you can execute code, and
> PG is a decent vector for executing code.

I don't really buy that in the case of postgres. Normally, in a medium
to high security world, once you have RCE in postgres, the valuable data
can already be exfiltrated. And that's game over. The only real benefit
of a kernel vulnerability is that it might allow the attack to persist
for longer - but there's already plenty you can do inside postgres once
you have RCE.

> At a minimum I would urge the community to look at adding high risk
> syscalls to the kill list, systemd has some predefined sets we can
> pick from like @obsolete, @cpu-emulation, @privileged, @mount, and
> @module.

I can see some small value in disallowing these - but we're back to the
point where that is better done one layer *above* postgres, by a process
with more privileges than PG. Because then a PG RCE doesn't have a way
to prevent those filters from being applied.

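To illustrate what "one layer above" could look like: a minimal sketch,
assuming postgres runs as a systemd service, of generating a drop-in
that has systemd install the deny list before postgres starts. The
helper name render_dropin, the unit name and the file path are
hypothetical; SystemCallFilter= and the listed syscall groups are
existing systemd features. Whether postgres and its extensions work
with all of these groups denied is not verified here.

```python
# Hypothetical helper: renders a systemd drop-in that denies syscall groups
# for a PostgreSQL service. The unit name and file path are assumptions.

DENIED_GROUPS = ["@obsolete", "@cpu-emulation", "@privileged", "@mount", "@module"]


def render_dropin(groups):
    """Return drop-in text asking systemd to install a deny-list seccomp filter."""
    for group in groups:
        if not group.startswith("@"):
            raise ValueError(f"not a systemd syscall group: {group!r}")
    # A leading "~" inverts SystemCallFilter= from an allow list to a deny list.
    return "[Service]\nSystemCallFilter=~" + " ".join(groups) + "\n"


if __name__ == "__main__":
    # Meant to be saved by root as, e.g.,
    # /etc/systemd/system/postgresql.service.d/seccomp.conf (assumed path).
    print(render_dropin(DENIED_GROUPS), end="")
```

Because the file is owned by root and applied by systemd, an ALTER
SYSTEM SET issued from a compromised backend cannot change it, and the
filter is applied again if systemd restarts the service.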
> The postmaster and per-role lists can further reduce the available
> syscalls based on the exact extensions and PLs being used.

I don't buy that per-session configurable lists buy you anything
meaningful. With an RCE in one session, it's pretty trivial to corrupt
shared memory to also trigger RCE in other sessions. And there's no way
seccomp or anything like that will prevent that.

An additional reason I'm quite sceptical about more fine-grained
restrictions is that I think we're going to have to go for some use of
threading in the next few years. While I think that's still far from
agreed upon, I think there's a pretty large number of "senior" hackers
that see this as the future. You can have per-thread seccomp filters,
but that's so trivial to circumvent (just overwrite some vtable-like
data in another thread's data, and have it call whatever gadget you
want), that it's not even worth considering.

Greetings,

Andres Freund