More fun with containers.
[Dec. 22nd, 2010 | 04:54 am]
Hanging out in Russia (the land of truly enormous spoons) with the engineers at Parallels' main office in Moscow, trying to absorb giant info-dumps and get over jetlag at the same time. They have an energy drink called "Adrenaline Rush" here, which is not bad.
Calling up google.com redirects to google.ru here, with no obvious way to make it not do that. (The non-obvious way to make it not do that turns out to be to visit google.com/en instead of google.com.)
Buying a local sim card for one's phone is apparently the same category of ordeal it is in the US, only with more paperwork and a language barrier. Decided to hold off until _next_ trip for that.
I wonder how you preview to see if you got the lj-cut tag right?
There are several different Linux containers projects, but the one that made it into the vanilla kernel is loosely based on Google's cgroups stuff. The openvz one is the most extensive feature-wise, but its userspace control knobs (the vzctl package) are based on system calls and ioctls that are only in the openvz kernel, not in vanilla. The new project, started to add controls for what's in vanilla, is the LXC package, which is full of rough edges and coded in the worst IBM "infrastructure in search of a user" style. (Maybe I'm spoiled by embedded development, where unused code is ruthlessly removed. Writing code you don't actually need and aren't currently using, just in case somebody someday might, is considered a BAD thing there.)
So I'm using the LXC package to set up a containers test environment, which is kind of painful. LXC's development history seems to consist largely of patches like this which literally do nothing but make the code bigger and more convoluted. They insist on needing capabilities but their scripts check to make sure they're running as root. You have to read through a huge amount of code to find the parts that are actually _doing_ anything, rather than just wrappers passing data around between themselves and checking that the previous wrapper layer agreed with the current wrapper layer about the format of said data. So when the code does something wrong, finding the relevant bits is the kind of expedition you bring Indiana Jones along for.
Anyway, I built a containers-enabled kernel, built a Debian Sid chroot and made an 8 gig ext3 image out of it, launched the two of them under kvm, fetched the lxc source and built it inside the kvm system. Then I built a defconfig busybox 1.18.1, and used lxc's busybox "template" (which is the term they invented for a chroot creation script... did I mention these guys love making up new terminology for no apparent reason?) to create a busybox container:
lxc-create -n 12345 -f doc/examples/lxc-macvlan.conf -t busybox
Using a statically linked busybox on the theory that I'm very familiar with it and it's probably the simplest root filesystem you can do, thus a good place to start.
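For a sense of how little a busybox "template" has to do, here is a minimal sketch of a chroot creation script for a static busybox. This is not what lxc's busybox template actually contains; make_busybox_root and the directory list are made up for illustration, and it assumes the busybox binary is statically linked and runnable on the host so that "busybox --list" can enumerate its applets.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from the lxc source tree.
# Usage: make_busybox_root /path/to/static/busybox /path/to/new/root
make_busybox_root() {
    mbr_busybox=$1
    mbr_root=$2

    # Bare-bones directory skeleton (an assumed minimal set).
    mkdir -p "$mbr_root/bin" "$mbr_root/dev" "$mbr_root/proc" \
             "$mbr_root/sys" "$mbr_root/etc" "$mbr_root/tmp" || return 1

    # Install the one real binary.
    cp "$mbr_busybox" "$mbr_root/bin/busybox" || return 1
    chmod 755 "$mbr_root/bin/busybox" || return 1

    # Every applet is just a symlink back to that binary.
    for mbr_applet in $("$mbr_root/bin/busybox" --list); do
        # Never replace the binary itself with a link to itself.
        [ "$mbr_applet" = busybox ] && continue
        ln -sf busybox "$mbr_root/bin/$mbr_applet" || return 1
    done
}
```

Something like "make_busybox_root ./busybox /tmp/bbroot" would then leave a directory you can chroot into. Device nodes, mounts and network setup are deliberately left out of the sketch.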
And I launched said container:
lxc-start -n 12345
And it SORT of worked. Except that the shell prompt I got echoes all the characters I type back at me, but only listens to at most two characters typed per second. If you don't put a half second delay between characters you type, they get dropped.
Another fun detail is that this only works the first time you start a container (after creating it). The second time, it fails with:
Failed to symlink /dev/pts/ptmx -> /dev/ptmx file exists.
And as soon as that happens, the host kernel starts spamming the console with "unregister_netdevice: waiting for eth0 to become free. Usage count = 1" and this will continue forever until you kill kvm and restart the Debian Sid host.
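The "file exists" failure above reads like a symlink call tripping over the link left behind by the first run. That is a guess from the error text, not something traced through the LXC code. Under that assumption, the usual cure is to make the step idempotent. A minimal sketch, where ensure_symlink is a made-up helper and the demo runs in a scratch directory rather than a real container root:

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from the LXC source.
# Make LINK point at TARGET, tolerating leftovers from an earlier run.
ensure_symlink() {
    es_target=$1
    es_link=$2

    # Already a symlink to the right place: nothing to do.
    if [ -L "$es_link" ] && [ "$(readlink "$es_link")" = "$es_target" ]; then
        return 0
    fi

    # Otherwise replace whatever is there. -n stops ln from following
    # an existing symlink that happens to point at a directory.
    ln -sfn "$es_target" "$es_link"
}

# Demo: the second call is the one a plain "ln -s" would fail on.
scratch=$(mktemp -d)
ensure_symlink /dev/pts/ptmx "$scratch/ptmx"
ensure_symlink /dev/pts/ptmx "$scratch/ptmx"
readlink "$scratch/ptmx"
rm -rf "$scratch"
```

The same pattern in C would be to treat EEXIST from symlink() as success when the existing link already points where it should.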
Let's recap: I have my laptop (running Ubuntu) running a KVM with a guest system based on Debian Sid, and inside the Debian system I'm playing around with LXC and containers to run a busybox root filesystem. That's two virtualization layers for three systems total: Ubuntu, Sid, Busybox. The second time I try to launch the busybox system, the Sid system gets confused and starts printk() spam to the KVM console that I can't stop until I restart KVM.
Hmmm, the unregister_netdevice thing is fixed by using the vanilla kernel instead of the nfs -git branch. So that's one bug down, a half-dozen to go, and then I can start paying attention to the part all this is supposedly just a test environment for....