On Wed, May 14, 2014 at 10:17:31PM -0400, Michael H. Warfield wrote:
> > > Using devtmpfs is one possible
> > > solution, and it would have the added benefit of making container setup
> > > simpler. But simply letting containers mount devtmpfs isn't sufficient
> > > since the container may need to see a different, more limited set of
> > > devices, and because different environments making modifications to
> > > the filesystem could lead to conflicts.
> > >
> > > This series solves these problems by assigning devices to user
> > > namespaces. Each device has an "owner" namespace which specifies which
> > > devtmpfs mount the device should appear in as well as allowing privileged
> > > operations on the device from that namespace. This defaults to
> > > init_user_ns. There's also an ns_global flag to indicate a device should
> > > appear in all devtmpfs mounts.
> >
> > I'd strongly argue that this isn't even a "problem" at all. And, as I
> > said at the Plumbers conference last year, adding namespaces to devices
> > isn't going to happen, sorry. Please don't continue down this path.
>
> I was just mentioning that to Serge just a week or so ago reminding him
> of what you told all of us face to face back then. We were having a
> discussion over loop devices into containers and this topic came up.

It was the loop device use case that got me started down this path in the first place, so I don't personally have any interest in physical devices right now (though I was sure others would).

As things stand today, to support loop devices lxc would need to do something like this: grab some unused loop devices, remove them from /dev, and make device nodes with appropriate ownership/permissions in the container's /dev. Otherwise there's potential for accidental duplicate use of the devices, which besides having unexpected results could result in an information leak into the container. At that point you have some loop devices that the container can use, but privileged operations such as re-reading partitions and encrypted loop aren't possible. Even if you can re-read partitions, the device nodes will appear in the main /dev and not in the container.

With these patches the container could mount devtmpfs, and since loop-control is global it would appear in the mount. The LOOP_CTL_GET_FREE ioctl can be used to get an unused loop device, which will be owned by the container's user namespace, so it will only appear in that container's devtmpfs mount. Privileged operations would be allowed on the loop device by root in the namespace, and if partition devices were created they would inherit the namespace from the parent and thus show up in the container's devtmpfs mount.

I think this use case demonstrates some real problems with only half-way solutions at the moment. I'm certainly open to other suggestions about how to solve them.

Thanks,
Seth
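
For reference, the LOOP_CTL_GET_FREE step described above can be sketched from userspace roughly as follows. This is a minimal illustration, not code from the patch series: the helper names get_free_loop_index() and loop_device_path() are made up for this example, and the caller is assumed to pass the path of a loop-control node (normally /dev/loop-control) that it is allowed to open. Only the ioctl itself and the /dev/loopN naming come from the existing loop driver interface.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

/*
 * Hypothetical helper: ask the loop-control node at ctl_path for an
 * unused loop device. Returns the device index (the N in /dev/loopN),
 * or -1 if the node cannot be opened or the ioctl fails.
 */
int get_free_loop_index(const char *ctl_path)
{
    int ctl = open(ctl_path, O_RDWR);
    if (ctl < 0)
        return -1;

    /* LOOP_CTL_GET_FREE returns the index of a free loop device. */
    int idx = ioctl(ctl, LOOP_CTL_GET_FREE);
    close(ctl);
    return idx < 0 ? -1 : idx;
}

/*
 * Hypothetical helper: format the conventional node name for a loop
 * index. Returns 0 on success, -1 if buf is too small.
 */
int loop_device_path(int idx, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/dev/loop%d", idx);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would pass "/dev/loop-control" to get_free_loop_index() and feed the result to loop_device_path(). Under the behaviour proposed in this thread, the node named by that path would then be expected to show up only in the calling container's devtmpfs mount; on a current kernel it simply appears in the host /dev.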