On 01/05/2011 03:37 AM, Daniel Lezcano wrote:
On 01/05/2011 08:53 AM, Rob Landley wrote:

On 01/04/2011 06:52 AM, Daniel Lezcano wrote:

On 01/04/2011 09:36 AM, Rob Landley wrote:

I'm attempting to write a simple HOWTO for setting up a container with LXC. Unfortunately, console handling is really really brittle and the only way I've gotten it to work is kind of unpleasant to document.

Using lxc 0.7.3 (both in debian sid and built from source myself), I can lxc-create a container, and when I run lxc-start it launches init in the container. But the console is screwy. If my init program is just a command shell, the first key I type will crash lxc-start with an I/O error. (Wrapping said shell with a script to redirect stdin/stdout/stderr to various /dev character devices doesn't seem to improve matters.)

Using the busybox template and the busybox-i686 binary off of busybox.net, it runs init and connects to the various tty devices, and this somehow prevents lxc-start from crashing. But if I "press enter to activate this console" like it says, the resulting shell prompt is completely unusable. If I'm running from an actual TTY device, then some of the keys I type go to the container and some don't. If my console is connected to a PTY when I run lxc-start (such as if I ssh in and run lxc-start from the ssh session), _none_ of the characters I type go to the shell prompt.

To get a usable shell prompt in the container, what I have to do is lxc-start in one window, ssh into the server to get a fresh terminal, and then run lxc-console in that second terminal. That's the only magic sequence I've found so far that works.

Hmm, right. I was able to reproduce the problem.

I've got two more. (Here's another half-finished documentation file, attached, which may help with the reproduction sequence.)

I'm running a KVM instance to host the containers, and I've fed it an e1000 interface as eth0 with the normal -net user, and a tun/tap device on eth1 with 192.168.254.1 associated at the other end. Inside KVM, I'm using this config to set up a container:

lxc.utsname = busybox
lxc.network.type = phys
lxc.network.flags = up
lxc.network.link = eth1
#lxc.network.name = eth0

And going:

lxc-start -n busybox -f busybox.conf -t busybox

Using that (last line of the config intentionally commented out for the moment) I get an eth1 in the container that is indeed the eth1 on the host system (which is a tun/tap device I fed to kvm as a second e1000 device). That's the non-bug behavior.

Bug #1: If I exit that container, eth1 vanishes from the world. The container's gone, but it doesn't reappear on the host. (This may be related to the fact that the only way I've found to kill a container is to do "killall -9 lxc-start". For some reason a normal kill of lxc-start is ignored. However, this still shouldn't leak kernel resources like that.)

It is related to the kernel behavior: a netdev with rtnl_link_ops will be automatically deleted when a network namespace is destroyed. The full answer is at net/core/dev.c:
Um, default_device_exit_batch() maybe? (6000 line file there...)

Unfortunately I can't rmmod a statically linked driver, and if you've got two e1000 devices in the system and are still _using_ one in the host, that's the wrong granularity level to re-probe at anyway.
If lxc-start could be killed by something other than -9 it could move the device back to the host context on the way out. (Although really, the kernel should either retain the interface or provide a way to re-probe it. As it is, in my setup I can't figure out how to relaunch a container using a physical network device without rebooting the host.)
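For what it's worth, here's a rough sketch of the mechanism involved (the interface name and $PID are just examples, and it assumes a reasonably recent iproute2): moving a physical device into a container's namespace is a one-liner from the host, but moving it back out needs a process that is already inside the container's network namespace to run the equivalent command, which is presumably what lxc-start would have to arrange on the way out.

# On the host: hand the physical interface to the network namespace of
# process $PID (the container init's pid as seen from the host).
ip link set eth1 netns $PID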
Is there a todo list for LXC? The lxc-development page doesn't link to a bugzilla...
Bug #2: When I uncomment that last line of the above busybox.conf, telling it to move eth1 into the container but call it "eth0" in there, suddenly the eth0 in the container gets entangled with the eth0 on the host, to the point where dhcp gives it an address. (Which is 10.0.2.16. So it's talking to the VPN that only the host's eth0 should have access to, but it's using a different mac address. Oddly, the host eth0 still seems to work fine, and the two IP addresses can ping each other across the container interface.) This is still using the most recent release version.

What is the kernel version?
2.6.37-rc8, vanilla Linus tree. (I applied some NFS test patches, but I haven't mounted NFS this boot so they shouldn't matter.)
Attached is an updated version of my first documentation file that includes the kernel configuration info in step 2.
Rob
To play around with containers, I chose to use a 3 layer approach:
Laptop - the host system running on real hardware (my Ubuntu laptop).
KVM - a virtual debian Sid system running under KVM.
Container - a simple busybox-based system running in a container.
So "Laptop" hosts "KVM" which hosts "Container".
The advantage of this approach is we can modify and repeatedly reboot the KVM system without interfering with the host laptop. We can also play with things like network routing without disconnecting the laptop from the internet.
Step 1: Create a root filesystem for the KVM system.
Here's how to create a Debian "sid" (unstable) root filesystem and package it into an 8 gigabyte ext3 image. The root password is "root". If you prefer a different root filesystem, feel free to use that instead. This procedure requires the "debootstrap", "genext2fs", and "e2fsprogs" packages to be installed.
This creates a smaller image and resizes it because genext2fs is extremely slow at creating large images.
You'll have to run this stage as root, and it requires network access. The remaining stages do not require root access.
sudo debootstrap sid sid
echo -e "root\nroot" | chroot sid passwd
echo -e "auto lo\niface lo inet loopback\nauto eth0\niface eth0 inet dhcp" \
  > sid/etc/network/interfaces
ln -sf vimrc sid/etc/vimrc.tiny
rm -f sid/etc/udev/rules.d/70-persistent-net.rules
echo kvm > sid/etc/hostname
echo cgroup /mnt/cgroup cgroup defaults >> sid/etc/fstab
mkdir -p sid/mnt/cgroup
BLOCKS=$(((1024*$(du -m -s sid | awk '{print $1}')*12)/10))
genext2fs -z -d sid -b $BLOCKS -i 1024 sid.ext3
resize2fs sid.ext3 8G
tune2fs -j -c 0 -i 0 sid.ext3
Now chown the "sid.ext3" file to your normal (non-root) user, and switch back to that user. (If you forget to chown, the emulated system won't be able to write to the ext3 file and will complain about write errors when you fire up KVM. Use your username instead of mine here.)
chown landley:landley sid.ext3
exit  # Stop being root on Laptop now
Step 2: Build a kernel for KVM, with container support.
The defconfig in 2.6.36 is close to a usable configuration, but needs a few more symbols switched on:
# Start with the default configuration
make defconfig

# Add /dev/hda and more container support.
cat >> .config << EOF
CONFIG_IDE=y
CONFIG_IDE_GD=y
CONFIG_IDE_GD_ATA=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_MEM_RES_CTLR=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED=y
CONFIG_BLK_CGROUP=y
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
EOF
yes '' | make oldconfig

# Build kernel (counting CPUS to supply appropriate -j to make)
CPUS=$(grep "^processor" /proc/cpuinfo | wc -l)
make -j $CPUS
This builds a (mostly) static kernel, because rebooting kvm with a new kernel image is trivial, but copying modules into a loopback mounted root filesystem image is a multi-step process requiring root access.
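(If you do enable modules, that multi-step process looks roughly like the following; a sketch, run as root on the laptop while KVM is not using the image, with /mnt/sid as an arbitrary example mount point.)

sudo mkdir -p /mnt/sid
sudo mount -o loop sid.ext3 /mnt/sid
sudo make modules_install INSTALL_MOD_PATH=/mnt/sid
sudo umount /mnt/sid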
Step 3: Boot the result under QEMU or KVM, and add more packages.
This invocation boots the newly built kernel with the sid root filesystem image, configured to exit the emulator when the virtual system shuts down. It allocates 1 gigabyte of memory and provides a virtual gigabit network interface hooked up to a virtual masquerading router (for the 10.0.2.X address range), with port 9876 on the host's loopback interface forwarded to the SSH port on the emulated interface.
kvm -m 1024 -kernel arch/x86/boot/bzImage -no-reboot -hda ~/sid.ext3 \
  -append "root=/dev/hda rw panic=1" -net nic,model=e1000 -net user \
  -redir tcp:9876::22
Log in to the resulting system (user root, password root), and install some more packages to fluff out the sid install a bit.
aptitude update
aptitude install file psmisc less strace bzip2 make gcc libc6-dev dropbear lxc
Step 4: ssh into the KVM instance.
The KVM/QEMU console window is a nice fallback, but awkward for serious use. To get multiple terminal windows, or use cut and paste, we need more.
Redirecting a port from the host's loopback interface to a port on the KVM instance lets us ssh in from the laptop system. In step 3 we installed the dropbear ssh server, and the "-redir tcp:9876::22" arguments we used to launch KVM forward port 9876 on the host's loopback interface to port 22 on KVM's eth0, so we should now be able to ssh in from the laptop via:
ssh root@127.0.0.1 -p 9876
Remember, root's password is "root". (Feel free to change it.)
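To change it, run this inside the KVM system:

passwd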
Step 5: Set up a simple busybox-based container under the KVM system.
The lxc-create command sets up a container directory with a new root filesystem. It takes three arguments: a name for the new container directory, a root filesystem build script, and a configuration file describing things like what network devices to put in the new container.
LXC calls its root filesystem build scripts "templates" (see /usr/lib/lxc/templates), the simplest of which is the "busybox" template.
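To see which templates your LXC package installed (the exact path may differ between distributions and versions):

ls /usr/lib/lxc/templates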
Unfortunately, the default busybox binary in Debian sid is insufficient. The "busybox" package doesn't include the "init" command, and the "busybox-static" package doesn't have "login". To work around this, we download a prebuilt busybox binary from the busybox website, and add the current directory to the $PATH so lxc-create can find it.
We supply a trivial configuration file defining no network devices, mostly to shut up the "are you really really sure" babysitting lxc-create would spew otherwise.
wget http://busybox.net/downloads/binaries/1.18.0/busybox-i686 -O busybox
chmod +x busybox
echo -e "lxc.utsname = container\nlxc.network.type = empty" > container.conf
PATH=$(pwd):$PATH lxc-create -f container.conf -t busybox -n container
LXC creates the container's directory (including its config file and its root filesystem) under /var/lib/lxc.
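If you're curious, you can poke around in what lxc-create produced (the paths below assume the container name "container" used above):

ls /var/lib/lxc/container
cat /var/lib/lxc/container/config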
Step 6: Launch the container
Launching containers requires the "cgroup" filesystem to be mounted somewhere. (It doesn't matter where; LXC checks /proc/mounts to find it.) In step 1, we added an fstab entry to the KVM sid system to mount cgroup on /mnt/cgroup.
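If that fstab entry is missing (or you skipped it), you can check and mount cgroup by hand, for example:

mkdir -p /mnt/cgroup
grep -q cgroup /proc/mounts || mount -t cgroup cgroup /mnt/cgroup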
We also need the LXC command line tools, which we installed in step 3.
Now we get to experience the brittle bugginess that is LXC 0.7.3. The first step to launching an LXC container is:
lxc-start -n container
This starts busybox init in the container, which will tell you "press Enter to activate this console". Unfortunately, LXC's console handling code is buggy, and this console won't actually work. (Feel free to play with it, just don't expect to accomplish much.)
To get a working shell prompt in the container, ssh into the KVM system again and from that window type:
lxc-console -n container
This will connect to one of init's other consoles, which finally lets you log in (as root). Repeat: you have to run lxc-start, leave it running, and run lxc-console in a second terminal in order to get a usable shell prompt.
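If juggling two ssh sessions gets old, one workaround is to park lxc-start in a detached screen session. This assumes the "screen" package, which wasn't in the step 3 package list, so install it first:

aptitude install screen
screen -dmS container lxc-start -n container
lxc-console -n container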
Step 7: Stop the container, and the KVM system.
To kill the container, run this on the KVM system:
killall -9 lxc-start
(I don't know why lxc-start ignores everything but "kill -9". I think it's another bug.)
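You can double-check that the container really stopped with lxc-info (the exact output wording may vary by LXC version):

lxc-info -n container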
Note that killall undoes lxc-start. If you want to undo the lxc-create (delete the container from /var/lib/lxc), the command is:
lxc-destroy -n container
You can exit the KVM system by closing the QEMU console window, by hitting Ctrl-C in the terminal you ran KVM from, or by running "shutdown -r now" in the KVM system.
Summary
You should now be able to get a shell prompt in all three systems:
The host laptop.
The Debian sid KVM.
The busybox container.
Next time, we set up networking in the container.