Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c

Blue Swirl Tue, 19 Jun 2012 11:59:38 -0700

On Tue, Jun 19, 2012 at 11:04 AM, Avi Kivity <a...@redhat.com> wrote:
> On 06/16/2012 09:46 AM, Blue Swirl wrote:
>> On Fri, Jun 15, 2012 at 9:36 PM, Paul Moore <pmo...@redhat.com> wrote:
>>> On Friday, June 15, 2012 09:23:46 PM Blue Swirl wrote:
>>>> On Fri, Jun 15, 2012 at 9:02 PM, Paul Moore <pmo...@redhat.com> wrote:
>>>> > On Friday, June 15, 2012 07:06:10 PM Blue Swirl wrote:
>>>> >> I think allowing execve() would render seccomp pretty much useless.
>>>> >
>>>> > Not necessarily.
>>>> >
>>>> > I'll agree that it does seem a bit odd to allow execve(), but there is
>>>> > still value in enabling seccomp to disable potentially buggy/exploitable
>>>> > syscalls. Let's not forget that we have over 300 syscalls on x86_64, not
>>>> > including the 32 bit versions, and even if we add all of the new syscalls
>>>> > suggested in this thread we are still talking about a small subset of
>>>> > syscalls.  As far as security goes, the old adage of "less is more"
>>>> > applies.
>>>>
>>>> The helper program being executed could need any of the 300 system
>>>> calls, so we'd have to allow all.
>>>
>>> Don't we have some basic understanding of what the applications being exec'd
>>> will need to do?  I sorta see your point, but allowing the entire set of
>>> syscalls seems a bit dramatic.
>>
>> At least qemu-ifup/down scripts, migration exec and smbd have been
>> mentioned. Only the system calls made by smbd (for some version of it)
>> can be known. The user could specify arbitrary commands for the
>> others, those could be assumed to use some common (large) subset of
>> system calls but I think the security value would be close to zero
>> then.
>
> We're not trying to protect against the user, but against the guest.  If
> we assume the user wrote those scripts with care so they cannot be
> exploited by the guest, then we are okay.


My concern was that first we could accidentally filter a system call
that changes the script or executable behavior, much like sendmail +
capabilities bug, and then a guest could trigger running this
script/executable and exploit the changed behavior.

>
> However I agree with you that it would be better to restrict those
> syscalls.  The scripts are already unnecessary if using a management
> system and migration supports passed file descriptors, so that leaves
> only smbd, which can probably be pre-execed.

File descriptor passing could also work for smbd.

>
>>
>>>
>>>> > Protecting against the abuse and misuse of execve() is something that is
>>>> > better done with the host's access controls (traditional DAC, MAC via the
>>>> > LSM, etc.).
>>>>
>>>> How about seccomp mode selected by command line switch -seccomp, in
>>>> which bind/connect/open/execve are forbidden? The functionality
>>>> remaining would be somewhat limited (can't migrate or use SMB etc.
>>>> until refactoring of QEMU), but that way seccomp jail would be much
>>>> tighter.
>>>
>>> When I spoke to Anthony about this earlier (offline, sorry) he was opposed 
>>> to
>>> requiring any switches or user interaction to enable seccomp.  I'm not sure 
>>> if
>>> his stance on this has changed any over the past few months.
>>
>> There could be two modes, strict mode (-seccomp) and default mode
>> (only some syscalls blocked). With the future decomposed QEMU, strict
>> seccomp mode would be default and the switch would be obsoleted. If
>> the decomposition is planned to happen soonish, adding the switch
>> would be just churn.
>
> We have decomposed qemu to some extent, in that privileged operations
> happen in libvirt.  So the modes make sense - qemu has no idea whether a
> privileged management system is controlling it or not.

So with -seccomp, libvirt could tell QEMU that for example open(),
execve(), bind() and connect() will never be needed?

>
>>
>>>
>>> In my perfect world, we would have a decomposed QEMU that functions as a
>>> series of processes connected via some sort of IPC; the exact divisions are 
>>> a
>>> bit TBD and beyond the scope of this discussion.  In this scenario we would 
>>> be
>>> able to restrict QEMU with sVirt and seccomp to a much higher degree than we
>>> could with the current monolithic QEMU.
>>>
>>> I don't expect to see my perfect world any time soon, but in the meantime we
>>> can still improve the security of QEMU on Linux with these seccomp patches 
>>> and
>>> for that reason I think it's a win.  Since these patches don't expose 
>>> anything
>>> at runtime (no knobs, switches, etc.) we leave ourselves plenty of 
>>> flexibility
>>> for changing things in the future.
>>
>> Yes, I'm much in favor of adding seccomp support soon. But I just
>> wonder if this is really the best level of security we can reach now,
>> not assuming decomposed QEMU, but just minor tweaks?
>
> We might disable mprotect(PROT_EXEC) if running with kvm.
>
> --
> error compiling committee.c: too many arguments to function
>
>

Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c

Reply via email to