Hi

On Mon, May 4, 2020 at 4:27 PM Colin Walters <walt...@verbum.org> wrote:
>
> On Mon, May 4, 2020, at 10:07 AM, Marc-André Lureau wrote:
>
> > Now that systemd-nspawn works without privileges, isn't that also a
> > solution? One that would fit both system and session level
> > permissions, and integration with other services?
>
> This is a complex topic and one I should probably write up in the bubblewrap
> README.md. Today for example for CoreOS, our build and CI processes run
> inside OpenShift (Kubernetes) - we aren't running systemd inside our
> containers.

Actually, I mean systemd-run (oops!)

> bubblewrap is a small self-contained C wrapper around the container system
> calls basically. In contrast, AFAICS right now, nspawn requires systemd -
> which won't work for our use case.
>
> Really the contention point here is systemd's dependency on cgroups for
> process tracking; in a "nested containerization" scenario you often just want
> the cgroups from the "outer" container to apply. But having nested
> mounts/pid namespaces are still very useful. (That said, cgroups v2 allows
> sane nesting, but we aren't there yet)
>
> Also related is https://github.com/kubernetes/enhancements/issues/127 -
> without that one requires privileged containers to do nesting.
>
> Now honestly, probably an even easier fix is `virtiofsd --disable-sandboxing`
> because we fully trust the code running in these VMs.
>
> Or to directly respond again to your proposal: systemd-nspawn as an option
> may work for some cases but won't for mine (I don't want virtiofsd/qemu
> instances to "escape" the build container or run separately).
>

You can run within your parent slice, and even more conveniently with:
https://github.com/systemd/systemd/pull/15362

--
Marc-André Lureau
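
To make "run within your parent slice" a bit more concrete, here is a rough sketch (not taken from the thread or from the linked pull request). It assumes the caller's slice can be read from the systemd line of /proc/self/cgroup, and that the command is then started as a transient scope with `systemd-run --scope --slice=...`. The helper names `current_slice` and `systemd_run_scope` are made up for illustration.

```python
from typing import List, Optional


def current_slice(cgroup_text: str) -> Optional[str]:
    """Return the innermost *.slice named in /proc/<pid>/cgroup content.

    Assumption: only the cgroup v2 unified line ("0::/path") or the
    cgroup v1 "name=systemd" line is relevant for systemd's unit tracking.
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _hierarchy, controllers, path = parts
        if controllers not in ("", "name=systemd"):
            continue
        # Slice unit names encode their position in the tree, so the last
        # ".slice" path component identifies the parent slice.
        slices = [p for p in path.split("/") if p.endswith(".slice")]
        if slices:
            return slices[-1]
    return None


def systemd_run_scope(command: List[str],
                      slice_name: Optional[str] = None,
                      user: bool = False) -> List[str]:
    """Build a systemd-run argv that runs `command` as a transient scope."""
    argv = ["systemd-run", "--scope", "--quiet"]
    if user:
        # Talk to the per-user service manager instead of the system one.
        argv.append("--user")
    if slice_name is not None:
        argv.append(f"--slice={slice_name}")
    argv.append("--")
    argv.extend(command)
    return argv
```

The resulting list could be handed to `subprocess.run()`, with `cgroup_text` read from /proc/self/cgroup. This only covers the systemd-run side; it does nothing for environments where no systemd instance is reachable, which is the case Colin describes.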