On Mon, Jun 18, 2012 at 02:55:35PM +0100, Daniel P. Berrange wrote:
> On Mon, Jun 18, 2012 at 09:52:44AM -0400, Paul Moore wrote:
> > On Monday, June 18, 2012 09:31:03 AM Daniel P. Berrange wrote:
> > > On Fri, Jun 15, 2012 at 05:02:19PM -0400, Paul Moore wrote:
> > > > On Friday, June 15, 2012 07:06:10 PM Blue Swirl wrote:
> > > > > I think allowing execve() would render seccomp pretty much useless.
> > > >
> > > > Not necessarily.
> > > >
> > > > I'll agree that it does seem a bit odd to allow execve(), but there is
> > > > still value in enabling seccomp to disable potentially buggy/exploitable
> > > > syscalls. Let's not forget that we have over 300 syscalls on x86_64, not
> > > > including the 32 bit versions, and even if we add all of the new syscalls
> > > > suggested in this thread we are still talking about a small subset of
> > > > syscalls. As far as security goes, the old adage of "less is more"
> > > > applies.
> > >
> > > I can sort of see this argument, but *only* if the QEMU process is being
> > > run under a dedicated, fully unprivileged (from a DAC pov) user, completely
> > > separate from anything else on the system.
> > >
> > > Or, of course, for a QEMU already confined by SELinux.
> >
> > Agreed ... and considering at least one major distribution takes this
> > approach it seems like reasonable functionality to me. Confining QEMU,
> > either through DAC and/or MAC, when faced with potentially malicious
> > guests is just good sense.
>
> Good, I'm not missing anything then. I'd suggest that future iterations
> of these patches explicitly mention the deployment scenarios in which
> this technology is able to offer increases security, and also describe
> the scenarios where it will not improve things.

Please correct me if I'm wrong here, but I don't understand exactly why
whitelisting execve() is odd. The whitelist is inherited and passed along
to child processes, so they also have their syscalls filtered by BPF in
the kernel, as stated in Will's commit log [1]: "Filter programs will be
inherited across fork/clone and execve." I wonder if this is the main
point of your concern. Whitelisting execve() or not should make no
difference from the security pov.

However, I agree that a possible future feature could be a customized
whitelist for each child process spawned. But as a first step, the
default whitelist should be enough to start seccomp support in Qemu.

Also, as far as I understand, seccomp was never meant to replace any of
the technologies mentioned above. Using more than one layer of protection
(SELinux, AppArmor MAC policy and/or DAC) should always be good practice
for defense in depth.

[1] - http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=e2cfabdfd075648216f99c2c03821cf3f47c1727

-- 
Eduardo Otubo
Software Engineer
Linux Technology Center
IBM Systems & Technology Group
Mobile: +55 19 8135 0885
eot...@linux.vnet.ibm.com