On Mon, May 23, 2011 at 6:56 PM, Kevin Wolf <kw...@redhat.com> wrote:
> On 23.05.2011 17:24, Markus Armbruster wrote:
>> Kevin Wolf <kw...@redhat.com> writes:
>>
>>> On 20.05.2011 21:53, Blue Swirl wrote:
>>>> On Fri, May 20, 2011 at 10:42 PM, Anthony Liguori <anth...@codemonkey.ws>
>>>> wrote:
>>>>> On 05/20/2011 02:25 PM, Blue Swirl wrote:
>>>>>>
>>>>>> On Fri, May 20, 2011 at 9:48 PM, Corey Bryant<brynt...@us.ibm.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> sVirt provides SELinux MAC isolation for Qemu guest processes and their
>>>>>>> corresponding resources (image files). sVirt provides this support
>>>>>>> by labeling guests and resources with security labels that are stored
>>>>>>> in file system extended attributes. Some file systems, such as NFS, do
>>>>>>> not support the extended attribute security namespace, which is needed
>>>>>>> for image file isolation when using the sVirt SELinux security driver
>>>>>>> in libvirt.
>>>>>>>
>>>>>>> The proposed solution entails a combination of Qemu, libvirt, and
>>>>>>> SELinux patches that work together to isolate multiple guests' images
>>>>>>> when they're stored in the same NFS mount. This results in an
>>>>>>> environment where sVirt isolation and NFS image file isolation can both
>>>>>>> be provided.
>>>>>>
>>>>>> Very nice. QEMU should use this to support privilege separation. We
>>>>>> already have chroot and runas switches, a new switch should convert
>>>>>> all file references to fd references internally for that process. If
>>>>>> this can be made transparent, this should even be the default way of
>>>>>> operation.
>>>>>
>>>>> You mean, QEMU starts up, opens all disk images, reinvokes itself in a
>>>>> confined context, and then passes fds to the child?
>>>>
>>>> And exit after that, or do the same without forking.
>>>>
>>>> This wouldn't work now for the native CDROM devices which need to
>>>> reopen the device. For that, an explicit reopen method could be added.
>>>> The method could even chat with the privileged process to get that to
>>>> do the reopening, but I'd leave that to libvirt and fail without it
>>>> for plain QEMU.
>>>
>>> There are more cases where we reopen the image file. One example is the
>>> 'commit' monitor command which temporarily reopens the backing file r/w.
>>> Or Christoph's patch that allows guests to toggle the write-cache
>>> enabled bit. Same for live snapshots. So we'll need a solution for them
>>> before doing anything like this.
>>>
>>> And breaking qemu without libvirt isn't really an option for me.
>>
>> Reopening files is evil. Sometimes flaws in the system call API make it
>> the only option. You can mitigate via /dev/fd/%d, but only on some
>> systems. The less we reopen, the better.
>>
>> An fd: protocol can't easily support reopen. So fail it. This doesn't
>> break any existing usage. It's just a restriction on the new protocol.
>> Restrictions can render the new protocol useless in practice, but we're
>> not "breaking qemu without libvirt" there.
>>
>> Perhaps we can relax the restriction on some systems by avoiding the
>> reopen in a system-dependent way.
>
> Right, you only get the regression once libvirt starts using it (or even
> worse, qemu itself, like Blue Swirl suggested). Doesn't make it much better.

One solution could be to add new commands which supply fresh fds while
performing the operation which needs reopening:

    commit_fd ide0-hd0 #33
    change ide1-cd0 fd:34

I don't remember the syntax for fd passing, so that part may be weird.
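
To make the "supply a fresh fd instead of reopening" idea a bit more concrete, here is a rough sketch in Python of the underlying mechanism: a privileged helper opens the file and hands the descriptor over a Unix socket as SCM_RIGHTS ancillary data, then later sends a second, read/write descriptor in place of a reopen. This is not QEMU or libvirt code. The names `send_fd`, `recv_fd` and `demo`, the message tags, and the temporary file standing in for a disk image are all made up for illustration, and both ends live in one process only to keep the example short.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd, note=b"fd"):
    """Send one open descriptor as SCM_RIGHTS ancillary data (hypothetical helper)."""
    payload = array.array("i", [fd]).tobytes()
    sock.sendmsg([note], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])


def recv_fd(sock):
    """Receive one descriptor; returns (note, fd) (hypothetical helper)."""
    fds = array.array("i")
    note, ancdata, _flags, _addr = sock.recvmsg(64, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer in the control message.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no descriptor received")
    return note, fds[0]


def demo():
    # Temporary file standing in for a disk image (assumption for the demo).
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"image-data")
        path = f.name

    # "helper" plays the privileged side, "confined" the side that never opens paths.
    helper, confined = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Initial hand-over: read-only descriptor.
        ro = os.open(path, os.O_RDONLY)
        send_fd(helper, ro, b"initial")
        os.close(ro)
        _, fd1 = recv_fd(confined)
        first = os.read(fd1, 5)

        # Instead of reopening r/w, the confined side is given a fresh fd.
        rw = os.open(path, os.O_RDWR)
        send_fd(helper, rw, b"reopen-rw")
        os.close(rw)
        os.close(fd1)
        note, fd2 = recv_fd(confined)

        # Append through the new descriptor, then read the whole file back.
        os.lseek(fd2, 0, os.SEEK_END)
        os.write(fd2, b"+commit")
        os.lseek(fd2, 0, os.SEEK_SET)
        content = os.read(fd2, 64)
        os.close(fd2)
        return first, note, content
    finally:
        helper.close()
        confined.close()
        os.unlink(path)
```

The point of the sketch is only that the receiving side never needs a path or the permission to open one; anything that today requires a reopen would instead have to be expressed as "here is a new fd for this device".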