Hi

On Fri, May 1, 2020 at 8:29 PM Colin Walters <walt...@verbum.org> wrote:
>
> I'd like to make use of virtiofs as part of our tooling in
> https://github.com/coreos/coreos-assembler
> Most of the code runs as non-root today; qemu also runs as non-root.
> We use 9p right now.
>
> virtiofsd's builtin sandboxing effectively assumes it runs as
> root.
>
> First, change the code to use `clone()` and not `unshare()+fork()`.
>
> Next, automatically use `CLONE_NEWUSER` if we're running as non root.
>
> This is similar logic to that in https://github.com/containers/bubblewrap
> (Which...BTW, it could make sense for virtiofs to depend on bubblewrap
> and re-exec itself rather than re-implementing the containerization
> itself)
>

Now that systemd-nspawn works without privileges, isn't that also a
solution? One that would fit both system and session level permissions,
and integration with other services?

> Signed-off-by: Colin Walters <walt...@verbum.org>
> ---
>  tools/virtiofsd/passthrough_ll.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/tools/virtiofsd/passthrough_ll.c
> b/tools/virtiofsd/passthrough_ll.c
> index 4c35c95b25..468617f6d6 100644
> --- a/tools/virtiofsd/passthrough_ll.c
> +++ b/tools/virtiofsd/passthrough_ll.c
> @@ -2530,6 +2530,21 @@ static void print_capabilities(void)
>      printf("}\n");
>  }
>
> +/* Copied from bubblewrap */
> +static int
> +raw_clone(unsigned long flags, void *child_stack)
> +{
> +#if defined(__s390__) || defined(__CRIS__)
> +    /*
> +     * On s390 and cris the order of the first and second arguments
> +     * of the raw clone() system call is reversed.
> +     */
> +    return (int) syscall(__NR_clone, child_stack, flags);
> +#else
> +    return (int) syscall(__NR_clone, flags, child_stack);
> +#endif
> +}
> +
>  /*
>   * Move to a new mount, net, and pid namespaces to isolate this process.
>   */
> @@ -2547,14 +2562,15 @@ static void setup_namespaces(struct lo_data *lo,
> struct fuse_session *se)
>       * an empty network namespace to prevent TCP/IP and other network
>       * activity in case this process is compromised.
>       */
> -    if (unshare(CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET) != 0) {
> -        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_NEWPID | CLONE_NEWNS): %m\n");
> -        exit(1);
> +    int clone_flags = SIGCHLD | CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET;
> +    /* If we're non root, we need a new user namespace */
> +    if (getuid() != 0) {
> +        clone_flags |= CLONE_NEWUSER;
>      }
>
> -    child = fork();
> +    child = raw_clone(clone_flags, NULL);
>      if (child < 0) {
> -        fuse_log(FUSE_LOG_ERR, "fork() failed: %m\n");
> +        fuse_log(FUSE_LOG_ERR, "clone() failed: %m\n");
>          exit(1);
>      }
>      if (child > 0) {
> --
> 2.24.1
>
>

-- 
Marc-André Lureau
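
For reference, the flag selection in the quoted hunk can be isolated into a small helper, which makes the root versus non-root behaviour easy to check on its own. This is only a sketch: `sandbox_clone_flags()` is a hypothetical name that does not appear in the patch, and taking the uid as a parameter (rather than calling `getuid()` inline) is an assumption made here for testability.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the CLONE_* constants in <sched.h> */
#endif
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of the patch): compute the flags that
 * would be passed to clone() for the sandbox child, given the caller's uid.
 */
static int sandbox_clone_flags(uid_t uid)
{
    /* SIGCHLD in the low bits makes the child report exit like a fork() child. */
    int flags = SIGCHLD | CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET;

    /* A non-root caller needs a new user namespace to create the others. */
    if (uid != 0) {
        flags |= CLONE_NEWUSER;
    }

    /* Postcondition: a user namespace is requested only for non-root callers. */
    assert(!(flags & CLONE_NEWUSER) || uid != 0);
    return flags;
}
```

A caller would then do something like `child = raw_clone(sandbox_clone_flags(getuid()), NULL);`, matching the quoted change. Whether `CLONE_NEWUSER` actually succeeds still depends on the host allowing unprivileged user namespaces.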