I'd like to make use of virtiofs as part of our tooling in https://github.com/coreos/coreos-assembler Most of the code runs as non-root today; qemu also runs as non-root. We use 9p right now.
virtiofsd's builtin sandboxing effectively assumes it runs as root. First, change the code to use `clone()` and not `unshare()+fork()`. Next, automatically use `CLONE_NEWUSER` if we're running as non root. This is similar logic to that in https://github.com/containers/bubblewrap (Which...BTW, it could make sense for virtiofs to depend on bubblewrap and re-exec itself rather than re-implementing the containerization itself) Signed-off-by: Colin Walters <walt...@verbum.org> --- tools/virtiofsd/passthrough_ll.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index 4c35c95b25..468617f6d6 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -2530,6 +2530,21 @@ static void print_capabilities(void) printf("}\n"); } +/* Copied from bubblewrap */ +static int +raw_clone(unsigned long flags, void *child_stack) +{ +#if defined(__s390__) || defined(__CRIS__) + /* + * On s390 and cris the order of the first and second arguments + * of the raw clone() system call is reversed. + */ + return (int) syscall(__NR_clone, child_stack, flags); +#else + return (int) syscall(__NR_clone, flags, child_stack); +#endif +} + /* * Move to a new mount, net, and pid namespaces to isolate this process. */ @@ -2547,14 +2562,15 @@ static void setup_namespaces(struct lo_data *lo, struct fuse_session *se) * an empty network namespace to prevent TCP/IP and other network * activity in case this process is compromised. */ - if (unshare(CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET) != 0) { - fuse_log(FUSE_LOG_ERR, "unshare(CLONE_NEWPID | CLONE_NEWNS): %m\n"); - exit(1); + int clone_flags = SIGCHLD | CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET; + /* If we're non root, we need a new user namespace */ + if (getuid() != 0) { + clone_flags |= CLONE_NEWUSER; } - child = fork(); + child = raw_clone(clone_flags, NULL); if (child < 0) { - fuse_log(FUSE_LOG_ERR, "fork() failed: %m\n"); + fuse_log(FUSE_LOG_ERR, "clone() failed: %m\n"); exit(1); } if (child > 0) { -- 2.24.1