On Thu, Mar 1, 2018 at 9:44 AM, Andy Lutomirski <l...@amacapital.net> wrote: > On Wed, Feb 28, 2018 at 7:56 PM, Daniel Borkmann <dan...@iogearbox.net> wrote: >> On 02/28/2018 12:55 AM, chris hyser wrote: >>>> On 02/27/2018 04:58 PM, Daniel Borkmann wrote: >> On 02/27/2018 05:59 PM, >>>> chris hyser wrote: >>>>>> On 02/27/2018 11:00 AM, Kees Cook wrote: >>>>>>> On Tue, Feb 27, 2018 at 6:53 AM, chris hyser <chris.hy...@oracle.com> >>>>>>> wrote: >>>>>>>> On 02/26/2018 11:38 PM, Kees Cook wrote: >>>>>>>>> On Mon, Feb 26, 2018 at 8:19 PM, Andy Lutomirski <l...@amacapital.net> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> 3. Straight-up bugs. Those are exactly as problematic as verifier >>>>>>>>>> bugs in any other unprivileged eBPF program type, right? I don't see >>>>>>>>>> why seccomp is special here. >>>>>>>>> >>>>>>>>> My concern is more about unintended design mistakes or other feature >>>>>>>>> creep with side-effects, especially when it comes to privileges and >>>>>>>>> synchronization. Getting no-new-privs done correctly, for example, >>>>>>>>> took some careful thought and discussion, and I'm shy from how painful >>>>>>>>> TSYNC was on the process locking side, and eBPF has had some rather >>>>>>>>> ugly flaws in the past (and recently: it was nice to be able to say >>>>>>>>> for Spectre that seccomp filters couldn't be constructed to make >>>>>>>>> attacks but eBPF could). Adding the complexity needs to be worth the >>>>> >>>>> Well, not really. One part of all the Spectre mitigations that went >>>>> upstream >>>>> from BPF side was to have an option to remove interpreter entirely and >>>>> that >>>>> also relates to seccomp eventually. But other than that an attacker might >>>>> potentially find as well useful gadgets inside seccomp or any other code >>>>> that is inside the kernel, so it's not a strict necessity either. >>>>> >>>>>>>>> gain. I'm on board for doing it, I just want to be careful. :) >>>>>>>> >>>>>>>> Another option might be to remove c/eBPF from the equation all >>>>>>>> together. >>>>>>>> c/eBPF allows flexibility and that almost always comes at the cost of >>>>>>>> additional security risk. Seccomp is for enhanced security yes? How >>>>>>>> about a >>>>>>>> new seccomp mode that passes in something like a bit vector or hashmap >>>>>>>> for >>>>>>>> "simple" white/black list checks validated by kernel code, versus user >>>>>>>> provided interpreted code? Of course this removes a fair number of >>>>>>>> things >>>>>>>> you can currently do or would be able to do with eBPF. Of course, >>>>>>>> restated >>>>>>>> from a security point of view, this removes a fair number of things an >>>>>>>> _attacker_ can do. Presumably the performance improvement would also be >>>>>>>> significant. >>>>> >>>>> Good luck with not breaking existing applications relying on seccomp out >>>>> there. >>>> >>>> This wasn't in the context of an implementation proposal, but the >>>> assumption would be to add this in addition to the old way. Now, does that >>>> make sense to do? That is the discussion. >> >> I see; didn't read that out from the above when you also mentioned removing >> cBPF, but fair enough. >> >>>>>>>> Is this an idea worth prototyping? >>>>>>> >>>>>>> That was the original prototype for seccomp-filter. :) The discussion >>>>>>> around that from years ago basically boiled down to it being >>>>>>> inflexible. Given all the things people want to do at syscall time, >>>>>>> that continues to be true. So true, in fact, that here we are now, >>>>>>> trying to move to eBPF from cBPF. ;) >>>>> >>>>> Right, agree. cBPF is also pretty much frozen these days and aside from >>>>> that, seccomp/BPF also just uses a proper subset of it. I wouldn't mind >>>>> doing something similar for eBPF side as long as this is reasonably >>>>> maintainable and not making BPF core more complex, but most of it can >>>>> already be set in the verifier anyway based on prog type. Note, that >>>>> performance of seccomp/BPF is definitely a demand as well which is why >>>>> people still extend the old remaining cBPF JITs today such that it can >>>>> be JITed also from there. >>>>> >>>>>> I will try to find that discussion. As someone pointed out here though, >>>>>> eBPF is being used by more and more people in areas where security is >>>>>> not the primary concern. Differing objectives will make this a long term >>>>>> continuing issue. We ourselves were looking at eBPF simply as a means to >>>>>> use a hashmap for a white/blacklist, i.e. performance not flexibility. >>>>> >>>>> Not really, security of verifier and BPF infra in general is on the top >>>>> of the list, it's fundamental to the underlying concept and just because >>>>> it is heavily used also in tracing and networking, it only shows that the >>>>> concept is highly flexible that it can be applied in multiple areas. >>> >>> If you're implying that because seccomp would have it's own verifier and >>> could therefore restrict itself to a subset of eBPF, therefore any future >>> additions/features to eBPF would not necessarily make seccomp less secure, >>> I mainly agree. Is that the argument? >> >> Ok, in addition to the current unpriv restrictions imposed by the verifier, >> what additional requirements would you have from your side in order to get >> to semantics that make sense for you wrt seccomp/eBPF? Just trying to >> understand how far we are away from that. Note that not every new feature, >> map or helper is enabled for every program type of course. >> > > I haven't looked at the exact unpriv restrictions lately, but I think > I remember them. Regardless, from my perspective, here's what I would > care about as a seccomp reviewer: > > 1. No extra information should become available to a seccomp filter > program without seccomp's explicit opt-in. In other words, a seccomp > filter program should be able to access the fields in struct > seccomp_data, the return values of BPF_CALL helpers explicitly > authorized by seccomp, and the values in maps authorized by seccomp > (if any!), and that's it. They should not be able to learn the > current time, any kernel pointers, user register state (except that > which is contained in seccomp_data), etc. I believe that this is > already the case except insofar as core BPF_CALL helpers may violate > this. (I'm not sure exactly what the policy is on the use of BPF_CALL > helpers.) > The only calls that were whitelisted was uid/gid, pid/tid, ktime, and prandom. There was no printk, nor perf_events. It didn't give access to maps, or anything else. I have to ask though, why isn't it okay for eBPF filters to have access to time? My use case is that I launch a job, and it has 5 minutes to do some privileged operations, and then I want to deny any privileged operations. Maybe when writeable maps are a thing, we can do something more advanced, but trying to narrow the window of attack has a lot of benefit.
> 2. Filter evaluation should have no side effects except as explicitly > authorized by seccomp or by systemwide tracing. So perf observing > that a seccomp filter ran is fine, but seccomp filters should not be > able to write to the system log, to perf ring buffers, to maps, etc. > See above. > 3. Stability. If a filter passes verification on two different > kernels, it should behave the same on both kernels, even if the filter > is buggy in some theoretical sense. In the sense that BPF is part of the uapi, and uapi is supposed to be stable? I think that already makes sense, because for all the places where eBPF is used for networking, it has to follow this property. > > And that's it. Obviously the attack surface provided by the ability > to load and run a filter should be minimized, but that's true for eBPF > in general and has little to do with seccomp in particular. > > #1 and #2 are probably fairly straightforward using existing > mechanisms, unless the BPF_CALL hooks or map authorization hooks for > program types need to be extended a bit to get it right. #3 is maybe > more interesting, but I imagine that XDP and any upcoming bpf-based > iptables replacement have the same requirement. In contrast, bpf > *tracing* doesn't really require #3 to the same extent -- it's really > such an awful thing if a buggy or otherwise naughty bpf tracing > program behaves differently after a kernel upgrade. > > I suppose that another way of saying this is that an eBPF seccomp > program should behave like a pure function except to the extent that > the seccomp core makes an explicit exception. > > I'm not terribly concerned about the additional attack surface exposed > by eBPF itself. Sure, it's more dangerous to allow a sandboxed > program to load its own eBPF programs than to allow a sandboxed > program to load its own cBPF programs, but such is the price of > progress. If I'm writing a restrictive sandbox a la chromium's, I'm > not going to allow it to load eBPF programs, but I can still use eBPF > to enforce the sandbox policy. > > --Andy