Quoting Eric W. Biederman (ebied...@xmission.com):
> "Daniel P. Berrange" <berra...@redhat.com> writes:
>
> > On Thu, Jul 05, 2012 at 06:49:06PM -0700, Eric W. Biederman wrote:
> >> Serge Hallyn <serge.hal...@canonical.com> writes:
> >>
> >> > Quoting Daniel P. Berrange (berra...@redhat.com):
> >> >> On Thu, Jul 05, 2012 at 03:00:26PM +0100, Daniel P. Berrange wrote:
> >> >> > Now, when using 'nova volume-attach':
> >> >> >
> >> >> >   # nova volume-attach 05eb16df-03b8-451b-85c1-b838a8757736
> >> >> >       a5ad1d37-aed0-4bf6-8c6e-c28543cd38ac /dev/sdf
> >> >> >
> >> >> > nova will import an iSCSI LUN from the nova volume service, on the
> >> >> > compute node. The kernel will assign it the next free SCSI drive
> >> >> > letter, in my case '/dev/sdc'.
> >> >> >
> >> >> > The libvirt nova driver will then do a mknod, using the volume name
> >> >> > passed to 'nova volume-attach'.
> >> >> > eg it will do
> >> >> >
> >> >> >   mknod /var/lib/nova/instances/instance-0000000e/rootfs/dev/sdf
> >> >>
> >> >> Oops, I'm slightly wrong here. What it actually does is
> >> >>
> >> >>   mount --bind /dev/sdc
> >> >>       /var/lib/nova/instances/instance-0000000e/rootfs/dev/sdf
> >> >>
> >> >> so you get a 'sdf' device, but with the major/minor number of the 'sdc'
> >> >> device. I can't say I particularly like this approach. Ultimately I
> >> >> think we need the kernel support to make this work correctly. In any
> >> >
> >> > Yes, that's what the 'devices namespace' is meant to address. I'm hoping
> >> > we can have some serious design discussion on that in the next few months.
> >>
> >> This is not the device namespace problem.
> >>
> >> This is the setns problem for mount namespaces, and the unprivileged
> >> mount problem.
> >>
> >> There may be a notification issue so user space can perform actions
> >> in a container when a device shows up.
> >>
> >> But it should be very possible on the host to call.
> >>   setns(containers_mount_namespace);
> >>   mknod("/dev/foo");
> >>   chown("/dev/foo", CONTAINER_ROOT_UID, CONTAINER_ROOT_GID);
> >>
> >> And then from inside the container, especially when I get the rest of
> >> the user namespace merged, it should be very possible to manipulate
> >> the block device because you have permission, and to mount the
> >> partitions of the block device, because you are root in your container.
> >>
> >> But until the user namespace is merged you really are root so you can
> >> mount whatever.
> >>
> >> Daniel does that sound like the support you are looking for?
> >
> > Yes, the setns(mnt) approach you describe above is exactly what I'd
> > like to be able to do, to solve the first half of the problem.
> >
> > The second part of the problem is that I have a /dev/sdf, or even a
> > /dev/volgroup00/logvol3 in the host (with whatever major:minor
> > number that implies), and I want to be able to make it always
> > appear as /dev/sda in the container (with the correspondingly
> > different major:minor number). I'm guessing this is what Serge
> > was referring to as the 'device' namespace problem
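For reference, the setns()/mknod()/chown() sequence quoted above could be sketched roughly as follows in Python. This is only an illustration under several assumptions: a kernel where setns(2) accepts mount namespaces and exposes /proc/<pid>/ns/mnt, a single-threaded caller running as root, and setns reached through ctypes. The names add_device_to_container and block_node_args are hypothetical helpers, not part of nova or libvirt.

```python
import ctypes
import os
import stat

CLONE_NEWNS = 0x00020000  # mount-namespace flag, as defined in <sched.h>


def block_node_args(major, minor, perms=0o660):
    """Return the (mode, device) pair that os.mknod() expects for a block node."""
    return stat.S_IFBLK | perms, os.makedev(major, minor)


def add_device_to_container(container_pid, path, major, minor, uid, gid):
    """Create a block device node inside another process's mount namespace.

    Assumption: the caller is root and single-threaded, and the kernel
    supports setns() on mount namespaces.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    # Join the mount namespace of a process running inside the container.
    with open(f"/proc/{container_pid}/ns/mnt") as ns:
        if libc.setns(ns.fileno(), CLONE_NEWNS) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    # From here on, paths resolve inside the container's mount tree.
    mode, dev = block_node_args(major, minor)
    os.mknod(path, mode, dev)
    os.chown(path, uid, gid)
```

Because setns() changes the namespace of the calling process itself, a real tool would probably run add_device_to_container in a short-lived forked child rather than in a long-running daemon.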
Right.

> Getting the device to always appear with the name /dev/sda is easy.

It's easy to log in and make it look that way. It's not easy to make
all distros see it that way across boot.

> Where does the need to have a specific device come from? I would have
> thought by now that hotplug had been around long enough that in general
> user space would not care.

Yes, the *primary* need for the devices namespace is to prevent a udev
storm in the host and to send uevents to the right place, and for
macvtap and loop devices.

> The only case that I know of where keeping the same device number seems
> reasonable is in the case of live migration of an application, in order to
> avoid issues with stat changing for the same file over the transition,
> and I think a synthesized hotplug event could probably handle that case.
>
> Is there another case besides buggy applications that have hard
> coded device numbers that need specific device numbers?

Other cases where specific device major:minor numbers are important are
things like makedev. There is lots of software, and especially automatic
update software, which insists that things have specific 'correct'
major:minor numbers.

FWIW my (presumably naive) view is that for each non-init devicens we'd
have a list of type-major:minor::type2-major2:minor2 (:: meaning maps-to).
Then if a uevent comes through not aimed at any type2-major2:minor2 valid
in the namespace, that ns doesn't get the uevent.

-serge

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
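To make the per-namespace mapping idea above a little more concrete, here is a small Python sketch. It is one possible reading of the proposal, not an existing kernel interface: the Dev and DeviceNamespace names are hypothetical, and the table is assumed to be keyed by the host-side (type, major, minor) triple, with the value being the triple as seen inside the namespace. A uevent whose device has no entry is simply not delivered to that namespace.

```python
from typing import NamedTuple


class Dev(NamedTuple):
    kind: str   # "b" for block, "c" for character
    major: int
    minor: int


class DeviceNamespace:
    """Hypothetical per-namespace table: host device -> in-namespace device."""

    def __init__(self, mappings):
        self.mappings = dict(mappings)

    def translate(self, host_dev):
        """Return the device as seen inside the namespace.

        Returns None when the device is not mapped, meaning the uevent
        should not be delivered to this namespace.
        """
        return self.mappings.get(host_dev)


# Example: the host's sdc (block 8:32) appears as sda (block 8:0) inside.
ns = DeviceNamespace({Dev("b", 8, 32): Dev("b", 8, 0)})
```

With this table, ns.translate(Dev("b", 8, 32)) gives the in-namespace device, while any unmapped host device yields None and its uevent would be dropped for that namespace.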