Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c

Blue Swirl Fri, 29 Jun 2012 13:01:13 -0700

On Fri, Jun 29, 2012 at 3:27 PM, Corey Bryant <cor...@linux.vnet.ibm.com> wrote:
>
>
> On 06/28/2012 03:49 PM, Blue Swirl wrote:
>>
>> On Wed, Jun 27, 2012 at 9:25 PM, Anthony Liguori <anth...@codemonkey.ws>
>> wrote:
>>>
>>> On 06/21/2012 03:04 AM, Avi Kivity wrote:
>>>>
>>>>
>>>> On 06/19/2012 09:58 PM, Blue Swirl wrote:
>>>>>>>
>>>>>>>
>>>>>>> At least qemu-ifup/down scripts, migration exec and smbd have been
>>>>>>> mentioned. Only the system calls made by smbd (for some version of
>>>>>>> it) can be known. The user could specify arbitrary commands for the
>>>>>>> others, those could be assumed to use some common (large) subset of
>>>>>>> system calls but I think the security value would be close to zero
>>>>>>> then.
>>>>>>
>>>>>>
>>>>>>
>>>>>> We're not trying to protect against the user, but against the guest.
>>>>>> If we assume the user wrote those scripts with care so they cannot be
>>>>>> exploited by the guest, then we are okay.
>>>>>
>>>>>
>>>>>
>>>>> My concern was that first we could accidentally filter a system call
>>>>> that changes the script or executable behavior, much like the
>>>>> sendmail + capabilities bug, and then a guest could trigger running
>>>>> this script/executable and exploit the changed behavior.
>>>>
>>>>
>>>>
>>>> Ah, I see.  I agree this is dangerous.  We should probably disable exec
>>>> if we seccomp.
>>>
>>>
>>>
>>> There's no great place to jump into this thread so I guess I'll do it
>>> here.
>>>
>>> There is absolutely no doubt that white-listing syscalls that we
>>> currently use provides an improvement in security.
>>>
>>> We need to assume:
>>>
>>> 1) QEMU is run as an unprivileged user
>>>
>>> 2) QEMU is already heavily restricted by SELinux
>>>
>>> In this case, seccomp() is not being used to replace MAC or DAC.  It's
>>> supplementing both of them by additionally filtering out syscalls that
>>> may have unknown kernel exploits in them.  That's all this initial
>>> effort is about. Since its scope is so limited, we can simply enable it
>>> unconditionally too.
>>
>>
>> I don't think the scope is limited in a safe way. What is the set of
>> system calls that can't ever cause problems to any possible ifup/down
>> scripts, migration exec helpers and various versions of smbd?
>>
>> For example, unlink() is missing. What if the ifup/down script needs
>> it for lock file cleanup? ftruncate()? Every socket syscall in case
>> LDAP is used to access user information by the libc?
>>
>> I think we can't define the safe set, except 'allow all'. I'd propose
>> one of the following to avoid breakage:
>>
>> 1. Allow all system calls for the initial patch, refactor later to
>> reduce the set. Useless until refactored.
>>
>> 2. Don't enable seccomp mode by default; when it is enabled, forbid
>> execve(). Limits functionality when enabled, no security benefit if
>> not enabled.
>
>
> It should be noted that PR_SET_NO_NEW_PRIVS is set by default when the
> seccomp filter is enabled by libseccomp.  This prevents any new privileges
> from being granted on execve.


This is probably getting very hypothetical, but what happens if the
ifup/down scripts need to run a setuid/gid helper or a helper with
additional privileges from file system capabilities?
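
To make the question concrete, here is a minimal sketch of the flag
itself, written in Python via ctypes for brevity rather than in QEMU's
C. It assumes a Linux 3.5+ kernel; the constant values are taken from
<linux/prctl.h>, and the function name is made up for illustration.
It does not touch libseccomp at all. The flag is set in a forked child
only, because it cannot be cleared again once set.

```python
import ctypes
import os

# Constants from <linux/prctl.h> (available since Linux 3.5).
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

_libc = ctypes.CDLL(None, use_errno=True)
# prctl() takes unsigned long arguments; the unused ones must be 0.
_libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
_libc.prctl.restype = ctypes.c_int


def no_new_privs_in_child():
    """Hypothetical helper: fork, set the flag in the child only,
    and return its value (before, after) as seen by the child."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the flag is one-way, so the parent is left untouched.
        os.close(read_fd)
        before = _libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
        _libc.prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
        after = _libc.prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0)
        os.write(write_fd, f"{before} {after}".encode())
        os._exit(0)
    # Parent: collect the child's report and reap it.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    before, after = data.decode().split()
    return int(before), int(after)
```

Once the flag reads back as 1, it is inherited across fork and execve,
and the kernel no longer grants privileges through setuid/setgid bits
or file capabilities on execve. So such a helper would still be
executed, but without the extra privileges it expects.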

>
>
>>
>> 3. Before enabling seccomp, fork a helper process without restrictions
>> that is used to launch other programs. Needs some work.
>>
>>>
>>> After we have this initial support, then we can look at a -sandbox
>>> option.  This option could prevent things like open()/execve() but that
>>> will come at a cost of features.
>>>
>>> I think the reasonable thing to do for -sandbox is to basically focus on
>>> the set of syscalls that QEMU would use if it were launched under
>>> libvirt.  We should obviously make improvements (things like -blockdev)
>>> to make this even more restrictive.
>>>
>>> Who knows, maybe we end up having multiple types of sandboxes.  A
>>> '-sandbox libvirt' and a '-sandbox user' where the latter is focused on
>>> the typical usage of an unprivileged user.
>>>
>>> But this is all stuff that can come later.  We solve a big problem by
>>> just getting the initial whitelist support in.
>>
>>
>> Fully agree, but we'd have to agree about what is a safe initial
>> whitelist.
>>
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>
>>>>
>>>>>>
>>>>>> We have decomposed qemu to some extent, in that privileged operations
>>>>>> happen in libvirt.  So the modes make sense - qemu has no idea whether
>>>>>> a privileged management system is controlling it or not.
>>>>>
>>>>>
>>>>>
>>>>> So with -seccomp, libvirt could tell QEMU that for example open(),
>>>>> execve(), bind() and connect() will never be needed?
>>>>
>>>>
>>>>
>>>> Yes.
>>>>
>>>
>>
>
> --
> Regards,
> Corey
>
>
