Quoting Andy Lutomirski (l...@amacapital.net):
> On May 15, 2014 1:26 PM, "Serge E. Hallyn" <se...@hallyn.com> wrote:
> >
> > Quoting Richard Weinberger (rich...@nod.at):
> > > Am 15.05.2014 21:50, schrieb Serge Hallyn:
> > > > Quoting Richard Weinberger (richard.weinber...@gmail.com):
> > > >> On Thu, May 15, 2014 at 4:08 PM, Greg Kroah-Hartman
> > > >> <gre...@linuxfoundation.org> wrote:
> > > >>> Then don't use a container to build such a thing, or fix the build
> > > >>> scripts to not do that :)
> > > >>
> > > >> I second this.
> > > >> To me it looks like some folks try to (ab)use Linux containers
> > > >> for purposes where KVM would much better fit in.
> > > >> Please don't put more complexity into containers. They are already
> > > >> horrible complex
> > > >> and error prone.
> > > >
> > > > I, naturally, disagree :) The only use case which is inherently not
> > > > valid for containers is running a kernel. Practically speaking there
> > > > are other things which likely will never be possible, but if someone
> > > > offers a way to do something in containers, "you can't do that in
> > > > containers" is not an apropos response.
> > > >
> > > > "That abstraction is wrong" is certainly valid, as when vpids were
> > > > originally proposed and rejected, resulting in the development of
> > > > pid namespaces. "We have to work out (x) first" can be valid (and
> > > > I can think of examples here), assuming it's not just trying to hide
> > > > behind a catch-22/chicken-egg problem.
> > > >
> > > > Finally, saying "containers are complex and error prone" is conflating
> > > > several large suites of userspace code and many kernel features which
> > > > support them. Being more precise would, if the argument is valid,
> > > > lend it a lot more weight.
> > >
> > > We (my company) use Linux containers since 2011 in production. First LXC,
> > > now libvirt-lxc.
> > > To understand the internals better I also wrote my own userspace to
> > > create/start
> > > containers. There are so many things which can hurt you badly.
> > > With user namespaces we expose a really big attack surface to regular
> > > users.
> > > I.e. Suddenly a user is allowed to mount filesystems.
> >
> > That is currently not the case. They can mount some virtual filesystems
> > and do bind mounts, but cannot mount most real filesystems. This keeps
> > us protected (for now) from potentially unsafe superblock readers in the
> > kernel.
> >
> > > Ask Andy, he found already lots of nasty things...
>
> I don't think I have anything brilliant to add to this discussion
> right now, except possibly:
>
> ISTM that Linux distributions are, in general, vulnerable to all kinds
> of shenanigans that would happen if an untrusted user can cause a
> block device to appear. That user doesn't need permission to mount it

Interesting point. This would further suggest that we absolutely must
ensure that a loop device which shows up in the container does not also
show up in the host.

> or even necessarily to change its contents on the fly.
>
> E.g. what happens if you boot a machine that contains a malicious disk
> image that has the same partition UUID as /? Nothing good, I imagine.
>
> So if we're going to go down this road, we really need some way to
> tell the host that certain devices are not trusted.
>
> --Andy