On 01/05/2011 03:37 AM, Daniel Lezcano wrote:
On 01/05/2011 08:53 AM, Rob Landley wrote:

On 01/04/2011 06:52 AM, Daniel Lezcano wrote:

On 01/04/2011 09:36 AM, Rob Landley wrote:

I'm attempting to write a simple HOWTO for setting up a container with LXC. Unfortunately, console handling is really really brittle and the only way I've gotten it to work is kind of unpleasant to document.

Using lxc 0.7.3 (both in debian sid and built from source myself), I can lxc-create a container, and when I run lxc-start it launches init in the container. But the console is screwy. If my init program is just a command shell, the first key I type will crash lxc-start with an I/O error. (Wrapping said shell with a script to redirect stdin/stdout/stderr to various /dev character devices doesn't seem to improve matters.)

Using the busybox template and the busybox-i686 binary off of busybox.net, it runs init and connects to the various tty devices, and this somehow prevents lxc-start from crashing. But if I "press enter to activate this console" like it says, the resulting shell prompt is completely unusable. If I'm running from an actual TTY device, then some of the keys I type go to the container and some don't. If my console is connected to a PTY when I run lxc-start (such as if I ssh in and run lxc-start from the ssh session), _none_ of the characters I type go to the shell prompt.

To get a usable shell prompt in the container, what I have to do is lxc-start in one window, ssh into the server to get a fresh terminal, and then run lxc-console in that second terminal. That's the only magic sequence I've found so far that works.

Hmm, right. I was able to reproduce the problem.

I've got two more. (Here's another half-finished documentation file, attached, which may help with the reproduction sequence.)

I'm running a KVM instance to host the containers, and I've fed it an e1000 interface as eth0 with the normal -net user, and a tun/tap device on eth1 with 192.168.254.1 associated at the other end. Inside KVM, I'm using this config to set up a container:

lxc.utsname = busybox
lxc.network.type = phys
lxc.network.flags = up
lxc.network.link = eth1
#lxc.network.name = eth0

And going:

lxc-start -n busybox -f busybox.conf -t busybox

Using that (last line of the config intentionally commented out for the moment) I get an eth1 in the container that is indeed the eth1 on the host system (which is a tun/tap device I fed to kvm as a second e1000 device). That's the non-bug behavior.

Bug #1: If I exit that container, eth1 vanishes from the world. The container's gone, but it doesn't reappear on the host. (This may be related to the fact that the only way I've found to kill a container is to do "killall -9 lxc-start". For some reason a normal kill of lxc-start is ignored. However, this still shouldn't leak kernel resources like that.)

It is related to the kernel behavior: a netdev with rtnl_link_ops will be automatically deleted when a network namespace is destroyed. The full answer is at net/core/dev.c:
Um, default_device_exit_batch() maybe? (6000 line file there...)

Unfortunately I can't rmmod a statically linked driver, and if you've got two e1000 devices in the system and are still _using_ one in the host, that's the wrong granularity level to re-probe at anyway.
If lxc-start could be killed by something other than -9 it could move the device back to the host context on the way out. (Although really, the kernel should either retain the interface or provide a way to re-probe it. As it is, in my setup I can't figure out how to relaunch a container using a physical network device without rebooting the host.)
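For what it's worth, here's a rough sketch of the mechanism involved (the interface name and $PID are just examples, and it assumes a reasonably recent iproute2): moving a physical device into a container's namespace is a one-liner from the host, but moving it back out needs a process that is already inside the container's network namespace to run the equivalent command, which is presumably what lxc-start would have to arrange on the way out.

# On the host: hand the physical interface to the network namespace of
# process $PID (the container init's pid as seen from the host).
ip link set eth1 netns $PID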
Is there a todo list for LXC? The lxc-development page doesn't link to a bugzilla...
Bug #2: When I uncomment that last line of the above busybox.conf, telling it to move eth1 into the container but call it "eth0" in there, suddenly the eth0 in the container gets entangled with the eth0 on the host, to the point where dhcp gives it an address. (Which is 10.0.2.16. So it's talking to the VPN that only the host's eth0 should have access to, but it's using a different mac address. Oddly, the host eth0 still seems to work fine, and the two IP addresses can ping each other across the container interface.) This is still using the most recent release version.

What is the kernel version?
2.6.37-rc8, vanilla Linus tree. (I applied some NFS test patches, but I haven't mounted NFS this boot so they shouldn't matter.)
Attached is an updated version of my first documentation file that includes the kernel configuration info in step 2.
Rob
To play around with containers, I chose to use a 3 layer approach:
Laptop - the host system running on real hardware (my Ubuntu laptop).
KVM - a virtual debian Sid system running under KVM.
Container - a simple busybox-based system running in a container.
So "Laptop" hosts "KVM" which hosts "Container".
The advantage of this approach is we can modify and repeatedly reboot the KVM system without interfering with the host laptop. We can also play with things like network routing without disconnecting the laptop from the internet.
Step 1: Create a root filesystem for the KVM system.
Here's how to create a Debian "sid" (unstable) root filesystem and package it into an 8 gigabyte ext3 image. The root password is "root". If you prefer a different root filesystem, feel free to use that instead. This procedure requires the "debootstrap", "genext2fs", and "e2fsprogs" packages to be installed.
This creates a smaller image and resizes it because genext2fs is extremely slow at creating large images.
You'll have to run this stage as root, and it requires network access. The remaining stages do not require root access.
sudo debootstrap sid sid
echo -e "root\nroot" | chroot sid passwd
echo -e "auto lo\niface lo inet loopback\nauto eth0\niface eth0 inet dhcp" \
  > sid/etc/network/interfaces
ln -sf vimrc sid/etc/vimrc.tiny
rm -f sid/etc/udev/rules.d/70-persistent-net.rules
echo kvm > sid/etc/hostname
echo cgroup /mnt/cgroup cgroup defaults >> sid/etc/fstab
mkdir -p sid/mnt/cgroup
BLOCKS=$(((1024*$(du -m -s sid | awk '{print $1}')*12)/10))
genext2fs -z -d sid -b $BLOCKS -i 1024 sid.ext3
resize2fs sid.ext3 8G
tune2fs -j -c 0 -i 0 sid.ext3
Now chown the "sid.ext3" file to your normal (non-root) user, and switch back to that user. (If you forget to chown, the emulated system won't be able to write to the ext3 file and will complain about write errors when you fire up KVM. Use your username instead of mine here.)
chown landley:landley sid.ext3
exit  # Stop being root on Laptop now
Step 2: Build a kernel for KVM, with container support.
The defconfig in 2.6.36 is close to a usable configuration, but needs a few more symbols switched on:
# Start with the default configuration
make defconfig

# Add /dev/hda and more container support.
cat >> .config << EOF
CONFIG_IDE=y
CONFIG_IDE_GD=y
CONFIG_IDE_GD_ATA=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_MEM_RES_CTLR=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED=y
CONFIG_BLK_CGROUP=y
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
EOF
yes '' | make oldconfig

# Build kernel (counting CPUS to supply appropriate -j to make)
CPUS=$(grep "^processor" /proc/cpuinfo | wc -l)
make -j $CPUS
This builds a (mostly) static kernel, because rebooting kvm with a new kernel image is trivial, but copying modules into a loopback mounted root filesystem image is a multi-step process requiring root access.
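(If you do enable modules, that multi-step process looks roughly like the following; a sketch, run as root on the laptop while KVM is not using the image, with /mnt/sid as an arbitrary example mount point.)

sudo mkdir -p /mnt/sid
sudo mount -o loop sid.ext3 /mnt/sid
sudo make modules_install INSTALL_MOD_PATH=/mnt/sid
sudo umount /mnt/sid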
Step 3: Boot the result under QEMU or KVM, and add more packages.
This invocation boots the newly built kernel with the sid root filesystem image, configured to exit the emulator when the virtual system shuts down. It allocates 1 gigabyte of memory and provides a virtual gigabit network interface hooked up to a virtual masquerading router (for the 10.0.2.X address range), with port 9876 on the host's loopback interface forwarded to the SSH port on the emulated interface.
kvm -m 1024 -kernel arch/x86/boot/bzImage -no-reboot -hda ~/sid.ext3 \
  -append "root=/dev/hda rw panic=1" -net nic,model=e1000 -net user \
  -redir tcp:9876::22
Log in to the resulting system (user root, password root), and install some more packages to fluff out the sid install a bit.
aptitude update
aptitude install file psmisc less strace bzip2 make gcc libc6-dev dropbear lxc
Step 4: ssh into the KVM instance.
The KVM/QEMU console window is a nice fallback, but awkward for serious use. To get multiple terminal windows, or use cut and paste, we need more.
Redirecting a port from the host's loopback interface to a port on the KVM instance lets us ssh in from the laptop system. In step 3 we installed the dropbear ssh server, and the "-redir tcp:9876::22" arguments we used to launch KVM forward port 9876 on the host's loopback interface to port 22 on KVM's eth0, so we should now be able to ssh in from the laptop via:
ssh root@127.0.0.1 -p 9876
Remember, root's password is "root". (Feel free to change it.)
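To change it, run this inside the KVM system:

passwd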
Step 5: Set up a simple busybox-based container under the KVM system.
The lxc-create command sets up a container directory with a new root filesystem. It takes three arguments: a name for the new container directory, a root filesystem build script, and a configuration file describing things like what network devices to put in the new container.
LXC calls its root filesystem build scripts "templates" (see /usr/lib/lxc/templates), the simplest of which is the "busybox" template.
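To see which templates your LXC package installed (the exact path may differ between distributions and versions):

ls /usr/lib/lxc/templates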
Unfortunately, the default busybox binary in Debian sid is insufficient. The "busybox" package doesn't include the "init" command, and the "busybox-static" package doesn't have "login". To work around this, we download a prebuilt busybox binary from the busybox website, and add the current directory to the $PATH so lxc-create can find it.
We supply a trivial configuration file defining no network devices, mostly to shut up the "are you really really sure" babysitting lxc-create would spew otherwise.
wget http://busybox.net/downloads/binaries/1.18.0/busybox-i686 -O busybox
chmod +x busybox
echo -e "lxc.utsname = container\nlxc.network.type = empty" > container.conf
PATH=$(pwd):$PATH lxc-create -f container.conf -t busybox -n container
LXC creates the container's directory (including its config file and its root filesystem) under /var/lib/lxc.
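If you're curious, you can poke around in what lxc-create produced (the paths below assume the container name "container" used above):

ls /var/lib/lxc/container
cat /var/lib/lxc/container/config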
Step 6: Launch the container
Launching containers requires the "cgroup" filesystem to be mounted somewhere. (It doesn't matter where; LXC checks /proc/mounts to find it.) In step 1, we added an fstab entry to the KVM sid system to mount cgroup on /mnt/cgroup.
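If that fstab entry is missing (or you skipped it), you can check and mount cgroup by hand, for example:

mkdir -p /mnt/cgroup
grep -q cgroup /proc/mounts || mount -t cgroup cgroup /mnt/cgroup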
We also need the LXC command line tools, which we installed in step 3.
Now we get to experience the brittle bugginess that is LXC 0.7.3. The first step to launching an LXC container is:
lxc-start -n container
This starts busybox init in the container, which will tell you "press Enter to activate this console". Unfortunately, LXC's console handling code is buggy, and this console won't actually work. (Feel free to play with it, just don't expect to accomplish much.)
To get a working shell prompt in the container, ssh into the KVM system again and from that window type:
lxc-console -n container
This will connect to one of init's other consoles, which finally lets you log in (as root). Repeat: you have to run lxc-start, leave it running, and run lxc-console in a second terminal in order to get a usable shell prompt.
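If juggling two ssh sessions gets old, one workaround is to park lxc-start in a detached screen session. This assumes the "screen" package, which wasn't in the step 3 package list, so install it first:

aptitude install screen
screen -dmS container lxc-start -n container
lxc-console -n container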
Step 7: Stop the container, and the KVM system.
To kill the container, run this on the KVM system:
killall -9 lxc-start
(I don't know why lxc-start ignores everything but "kill -9". I think it's another bug.)
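You can double-check that the container really stopped with lxc-info (the exact output wording may vary by LXC version):

lxc-info -n container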
Note that killall undoes lxc-start. If you want to undo the lxc-create (delete the container from /var/lib/lxc), the command is:
lxc-destroy -n container
You can exit the KVM system by closing the QEMU console window, by hitting Ctrl-C in the terminal you ran KVM from, or by running "shutdown -r now" in the KVM system.
Summary
You should now be able to get a shell prompt in all three systems:
The host laptop.
The Debian sid KVM.
The busybox container.
Next time, we set up networking in the container.