On Wednesday, September 18, 2013 04:59:10 PM Daniel P. Berrange wrote:
> On Wed, Sep 18, 2013 at 11:53:09AM -0400, Paul Moore wrote:
> > On Wednesday, September 18, 2013 08:38:17 AM Daniel P. Berrange wrote:
> > > Libvirt does not want to be in the business of creating seccomp syscall
> > > filters for QEMU. As mentioned before, IMHO that places an unacceptable
> > > burden on libvirt to know about the syscalls each a particular version
> > > of QEMU requires for its operation.
> >
> > At a high level, I don't see how libvirt configuring and installing a
> > syscall filter is substantially different from libvirt configuring and
> > installing a network filter.
>
> The rules created for a network filter have no bearing or relation to
> internal QEMU implementation details, as you have with syscalls, so
> this isn't really a relevant comparison.
The rules created for a network filter are directly related to the details of
the guest running inside of QEMU. From a practical point of view I see both
network and syscall filtering as being dependent on the guest: the network
filtering configuration can change as the guest's services change, and the
syscall filtering configuration can change as the QEMU functionality changes.

> > Also, and I recognize this is diverting away from a topic most of
> > qemu-devel is not interested in, what about libvirt-lxc? What about all
> > of the other virtualization drivers supported by libvirt (granted, not
> > all would be candidates for syscall filtering, but you get the idea).
>
> It isn't clear to me that syscall filtering is something that's relevant
> for inclusion in libvirt-lxc. It seems like something that would be used
> by apps running inside LXC containers directly.

For all the same reasons that it makes sense to filter syscalls in QEMU, I
think it makes sense to filter syscalls in libvirt-lxc. The fundamental
concern is that the kernel presents a large attack surface in the way of
syscalls, and it is extremely likely that any given container does not have a
legitimate need to call into all of the syscalls the kernel presents to
userspace, especially if you consider the recent approaches of using
containers to ship/deploy single applications.

Also, just in case there are some misconceptions floating around, loading a
syscall filter in libvirt doesn't mean the individual container applications
can't also load their own filter. When multiple syscall filters are present
for a given process, all of the filters are evaluated and the most restrictive
decision for a given syscall request "wins".

> Libvirt has no knowledge of such apps or what rules they might require, so
> can't make any kind of intelligent decision about syscall filtering for LXC.
A perfectly valid point, but I also think of syscall filtering as giving the
host administrator the ability to reduce the attack surface of the host
system/kernel from potentially malicious containers/applications without
having to rely on these containers/applications to police themselves.

> I really view seccomp as something that apps use directly themselves, not
> something that a 3rd party process applies prior to launching the apps,
> since the latter has far too much administrative burden IMHO.

The seccomp filter functionality is definitely something that apps can use
themselves, but to limit syscall filtering to just that use case is to miss
out on other valid uses as well. As far as the burden is concerned, if
users/administrators find it too difficult, there is nothing requiring them to
use it. However, for those who are facing serious security risks in their
deployments, providing syscall filtering in libvirt might be a very welcome
addition.

-- 
paul moore
security and virtualization @ redhat
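
For anyone who wants the filter stacking behavior mentioned above spelled out,
here is a rough sketch in Python. It is only a toy model, not kernel or
libseccomp code: the Action names mirror the seccomp return actions (KILL,
TRAP, ERRNO, TRACE, ALLOW) in their order of precedence, while make_filter(),
combined_decision(), and the example rules are hypothetical names made up for
illustration. Newer kernels define additional actions that are left out here.

```python
from enum import IntEnum


class Action(IntEnum):
    # Lower value = more restrictive; this mirrors the precedence the
    # kernel applies when several filters return different actions.
    KILL = 0
    TRAP = 1
    ERRNO = 2
    TRACE = 3
    ALLOW = 4


def make_filter(rules, default):
    """Build a toy filter: a per-syscall rule table plus a default action."""
    def evaluate(syscall):
        return rules.get(syscall, default)
    return evaluate


def combined_decision(filters, syscall):
    """Evaluate every loaded filter; the most restrictive action wins."""
    return min(f(syscall) for f in filters)


# Hypothetical filter loaded by the management layer before launch:
# allow by default, but block a few syscalls outright.
host_filter = make_filter(
    {"mount": Action.KILL, "ptrace": Action.ERRNO},
    default=Action.ALLOW,
)

# Hypothetical filter the application loads for itself later:
# deny by default, allow only what it believes it needs.
app_filter = make_filter(
    {"read": Action.ALLOW, "write": Action.ALLOW, "mount": Action.ALLOW},
    default=Action.ERRNO,
)

stack = [host_filter, app_filter]

for name in ("read", "open", "mount"):
    print(name, combined_decision(stack, name).name)
```

Note that the application's ALLOW for mount does not override the earlier
KILL; a later filter can only tighten the combined policy, never loosen it.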