On Mon, May 4, 2020, at 10:07 AM, Marc-André Lureau wrote:

> Now that systemd-nspawn works without privileges, isn't that also a
> solution? One that would fit both system and session level
> permissions, and integration with other services?

This is a complex topic, and one I should probably write up in the bubblewrap
README.md.  For example, today our CoreOS build and CI processes run inside
OpenShift (Kubernetes) - we aren't running systemd inside our containers.

bubblewrap is basically a small, self-contained C wrapper around the container
system calls.  In contrast, AFAICS, nspawn currently requires systemd -
which won't work for our use case.
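
To make "small wrapper" a bit more concrete, here is a rough Python sketch of
how a build script might assemble a bwrap command line. The helper name
`bwrap_argv` and the particular set of flags are assumptions chosen for
illustration, not something taken from our actual CI. It only sets up
namespaces; nothing in it talks to systemd or creates a cgroup.

```python
import shlex


def bwrap_argv(command, rootfs="/"):
    """Return an argv that runs `command` under bubblewrap.

    Hypothetical helper: the flag selection is an illustrative
    assumption. Only mount and pid namespaces are requested, so the
    child stays in whatever cgroup the caller is already in.
    """
    return [
        "bwrap",
        "--bind", rootfs, "/",   # expose the chosen root inside the sandbox
        "--dev", "/dev",         # minimal /dev
        "--proc", "/proc",       # fresh /proc for the new pid namespace
        "--unshare-pid",         # new pid namespace
        "--die-with-parent",     # child is killed if the caller exits
        *command,
    ]


if __name__ == "__main__":
    # Print the command line rather than running it, since bwrap may
    # not be installed where this sketch is tried out.
    print(shlex.join(bwrap_argv(["sh", "-c", "echo hello"])))
```

The resulting list could be handed to `subprocess.run()` on a host where bwrap
is installed and unprivileged namespaces are permitted.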

Really the contention point here is systemd's dependency on cgroups for process
tracking; in a "nested containerization" scenario you often just want the
cgroups from the "outer" container to apply.  But having nested mount/pid
namespaces is still very useful.  (That said, cgroups v2 allows sane nesting,
but we aren't there yet.)
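
One way to check that the "outer" cgroups still apply to a nested process is
to compare the contents of /proc/<pid>/cgroup for the parent and the child.
The sketch below is only an illustration; the function names `parse_cgroup`
and `same_cgroup` are made up here, and it relies on the documented
`hierarchy-ID:controller-list:path` line format of that file.

```python
def parse_cgroup(text):
    """Parse /proc/<pid>/cgroup content into a list of tuples.

    Each line has the form hierarchy-ID:controller-list:path.
    On cgroups v2 the controller list is empty ("0::/some/path").
    """
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        names = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), names, path))
    return entries


def same_cgroup(parent_text, child_text):
    """True if both processes report identical cgroup membership."""
    return parse_cgroup(parent_text) == parse_cgroup(child_text)


if __name__ == "__main__":
    # Show the cgroup membership of the current process (Linux only).
    with open("/proc/self/cgroup") as f:
        for entry in parse_cgroup(f.read()):
            print(entry)
```

If a process started with only new mount/pid namespaces reports the same
entries as its parent, it is still being accounted under the outer container.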

Also related is https://github.com/kubernetes/enhancements/issues/127 - without
that, nesting requires privileged containers.

Now honestly, an even easier fix is probably `virtiofsd --disable-sandboxing`,
because we fully trust the code running in these VMs.

Or, to respond directly to your proposal again: systemd-nspawn as an option may
work for some cases but won't for mine (I don't want virtiofsd/qemu instances
to "escape" the build container or run separately).

