I miss the days I studied medicine in college. (I stopped because you can't even become a _nurse_ without having to give people injections, and I have a problem with needles.)

So about ten years ago, we figured out how to bring people back from the dead (at least in a very limited way), and it barely even made news.

For years people weren't considered dead until they were "warm and dead". Drowning victims in cold water were revived without obvious ill effects after being underwater for _hours_. Great scientific discoveries don't start with a shout of "eureka", they start with "that's odd", and this was very odd. In the early 2000's, we finally started to figure out what was going on. Unfortunately, the current medical establishment is horribly set up when it comes to actually applying it.

If you google "cold therapy for heart attack" you'll find a decade's worth of articles about how it turns out brain cells aren't really dead after 4 minutes, or 8 minutes, or even an hour without oxygen. They last about as long as any other tissue, you just have to restore circulation before gangrene sets in (I.E. before the cells die). How long is that? Well, detached limbs can generally be reattached after 6 hours without circulation, longer if you keep the part cold. Kidney transplants are packed in ice (kept very cold but not quite allowed to freeze) and they try to stay under 30 hours from harvest to implantation, which is more than a day. You can sometimes reattach a limb a couple days later if it was kept cold enough. (Even hibernating mammals don't quite go to the level of the mushrooms and onions still alive after a week in a plastic bag in my refrigerator, but cutting off circulation to your arm for an hour while sleeping just makes it numb and then pins and needles when you restore circulation. It's not "oh no, all your nerves are dead".)

So why is the brain so different? What happens is brain cells literally self-destruct when oxygen is restored, because the cell runs a self-diagnostic on reboot, notices it's running _way_ out of spec, and detonates on the assumption it's gone cancerous or has a virus or something. (This is called "autolysis" and is fairly common, the cells in higher organisms are there to serve the body as a whole, not to keep themselves alive. Skin and hair cells serve their purpose while dead. The digestive tract is lined with cells that _get_digested_ along with the food. The placenta in mammals is made from the exact same cells as the fetus, it's just forming scaffolding rather than the part you keep, and the fetus itself kills more cells than it keeps positioning tissues into organs and such.)

Cells self-destructing when they notice they're defective is our body's first line of defense against cancer and viruses, and it's especially important in "immunoprivileged" areas like the brain which are mostly isolated from the rest of the body in a way that the normal immune system can't easily reach: self-policing is the main option there. That's why your central nervous system and peripheral nervous system respond so differently here: the blood/brain barrier enforces different immunoprivilege domains. This filter membrane keeps most infections out, but also prevents white blood cells from spotting and fixing problems. Some viruses are so small you can't keep 'em out without also excluding oxygen and nutrients, and cancer is existing cells going nuts (generally due to some sort of damage, although botched division is a possibility), so these cells must self-police much more aggressively.

So brain cells are on a self-destruct hair trigger to ward off brain cancer and viruses (or at least keep it down to a dull roar), and in the case of oxygen deprivation the self-destruct mechanism gets triggered incorrectly, which is more or less an autoimmune problem. All that "keep CPR going" stuff is to prevent the oxygen levels in the brain from ever getting low enough to go out of spec to trigger cell autolysis and thus brain damage when oxygen is restored.

But the self-diagnostic can't run when the cell has no power, I.E. has run completely out of oxygen. The cells die when oxygen is _restored_. And it turns out if you cool the brain way down to not-quite-freezing you can re-oxygenate those cells before the self-diagnostic can run, so that by the time the cell wakes up it's operating within sufficiently normal parameters that it considers itself worth salvaging. So if you can cool down a fresh enough corpse, re-oxygenate all the cells, and then carefully warm it up again (juggling the normal problems of hypothermia; if the body's too cold it can't generate heat metabolically, and the heart won't beat either)... people can turn out to be less dead than you expected.

And this turns out to work pretty well, except that it's fighting the body's normal response to cold (our cells have a higher heat tolerance than cold tolerance, only a few degrees below normal and we pass out, so the body actually cuts circulation to the extremities to keep blood and the remaining oxygen and warmth going through the heart, lungs and brain... exactly where we _don't_ want it if we're trying to cool it down while keeping it oxygen-depleted until we're ready). So just dumping ice on a person may not lower their core temperature enough to make a difference, and will give the extremities frostbite as you cool those cells down to freezing and ice crystals forming in the cells tear them apart. And then if you _do_ lower the core temperature enough that the cells aren't functioning, that means the heart's too cold to beat, so you've then gotta treat the hypothermia...

So what you do is hook your victim up to a heart-lung machine and cool their blood way down _before_ you start oxygenating it. Then very slowly warm them back up until their heart starts beating again on its own, and hope their brain reboots without smoke coming out. The heart-lung machine takes care of hypothermia. It's also hugely invasive and expensive, but it's a technology we happened to have lying around already. (You can also do stuff with tubs of ice water if that's what you've got, but it's much harder to control. Still, that's something _everybody_ has.)

How cold do you want to get people? Water's maximum density is around 4 degrees Celsius, above that it behaves like normal materials and shrinks as the atoms it's made of vibrate less, below that it starts to form persistent hydrogen bonds which keep the molecules at arm's length from each other. Enough hydrogen bonds and you've got ice crystals, which are sharp and bigger than the water was, and expanding sharp crystals tend to shred things they form in, whether it's concrete or cells. So 4 degrees Celsius is a decent target. But it turns out cells stop _functioning_ much warmer: normal body temperature is 98.6 degrees Fahrenheit, at 89 degrees you lose the ability to generate heat (even by shivering), at 86 you lose consciousness. Heartbeat stops being reliable below 82 degrees and stops completely around 65 degrees (all Fahrenheit). So it's quite possible _room_temperature_ is enough to stop neural autolysis, we don't know. (It's kind of hard to find volunteers to experiment on for this sort of thing.)
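Since that paragraph mixes Celsius and Fahrenheit, here's a quick sketch that puts the thresholds on one scale. The numbers are just the ones quoted above; the function and variable names are mine, made up for illustration.

```python
def fahrenheit_to_celsius(deg_f):
    """Standard conversion: subtract 32, then scale by 5/9."""
    return (deg_f - 32) * 5 / 9


# Thresholds exactly as quoted in the text (degrees Fahrenheit).
THRESHOLDS_F = [
    (98.6, "normal body temperature"),
    (89, "can no longer generate heat"),
    (86, "loss of consciousness"),
    (82, "heartbeat becomes unreliable"),
    (65, "heartbeat stops"),
]


def thresholds_in_celsius():
    """Return (celsius, label) pairs, rounded to one decimal place."""
    return [(round(fahrenheit_to_celsius(f), 1), label)
            for f, label in THRESHOLDS_F]


if __name__ == "__main__":
    for deg_c, label in thresholds_in_celsius():
        print(f"{deg_c:5.1f} C  {label}")
```

Even the coldest threshold on the list (heartbeat stopping, around 65 Fahrenheit) works out to roughly 18 Celsius, nowhere near the 4 Celsius "don't form ice" target.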

The weird part is that at a colder temperature chemical reactions run more slowly, so a cold cell actually consumes very little oxygen, meaning oxygen can actually build back up by diffusion even without the circulatory system running. (Even a big cactus doesn't need a heart or lungs; I'd say "tree" but a tree is mostly dead wood surrounded by a thin layer of living tissue that oxygen can diffuse into.)

So you get persistent reports of dead people "coming back to life" at funerals and such in third world countries, which _might_ be because the cells weren't actually dead yet, the brain cooled down below the point it could run a self-diagnostic when oxygen was restored, oxygen diffused into the mostly inert cells, the body slowly warmed up to the point where the heart started beating again (people do recover from hypothermia on their own sometimes)... and the "dead" person recovered. Maybe.

It's VERY UNLIKELY for all those things to happen just right, but one of the open secrets in medicine is that defining "dead" is really hard, because people just don't cooperate. Our culture has the phrase "left for dead" for a reason, and the elaborate medieval attempts to avoid being "buried alive" because of this are well documented.

In fact, modern mortuary practice is built around making sure mistakenly thinking someone was dead and being proven wrong never happens around here anymore by "killing the corpse", I.E. sticking the body in a freezer, draining their blood, and replacing it with embalming fluid to provide CERTAINTY that they're dead _now_. But back when The Wizard of Oz was written (in 1900), "not only merely dead, really most sincerely dead" was still saying something, and it apparently still happens in countries with different funeral practices than ours.

Of course less-modern countries are even less likely than we are to go "what happened here medically" instead of dismissing it as a miracle. And when _we_ hear about it we naturally assume they're idiots who missed a heartbeat, as opposed to "somebody's heart stopped for 4 hours, then started up again, and their brain cells re-oxygenated without triggering widespread autolysis. I would _love_ to know all the temperature thresholds and diffusion rates involved in that".

In _both_ cases our preconceptions about what must be happening blind us to asking the right questions. Of course we're familiar with _that_ story from "do bacteria cause ulcers" and "how does cholesterol actually work" and a thousand similar things. (Imagine if every new computer processor generation had to go through FDA clinical trials, meaning you had to convince a panel of octogenarians your new idea was worth ten years of funding to see if it _might_ produce good results a decade or two from now.)

And another fun thing here is that this is a therapy, not a drug, so the drug companies aren't interested. The most effective treatment for cystic fibrosis (chest percussion) is a therapy, I.E. a technique a trained person performs. A drug company can't charge per dose for that, so they've been desperately trying to replace it with some sort of pill or inhaler for decades now. (This is an oversimplification, but our medical system being horked beyond imagining is not news. We literally have multiple cartels cornering the market on health care services without even bringing the drug companies or HMOs into it.)

And of course we're sitting on some known variants such as ApoA-1 Milano, where somebody got a patent (on a natural mutation) and is sitting on it until it expires, and _then_ maybe progress might happen.

But this sort of "we know how to improve stuff, but it's not profitable" problem isn't limited to medicine. It's limited to any place enough money accumulates to attract parasites. 3 of the 4 largest companies in the Fortune 500 right now are oil companies; let's ignore the whole "hire the tobacco institute to discredit global warming" thing, and go back to basic physics. If you mix water and gasoline with a detergent, the resulting mix has the same miles per gallon as pure gasoline (because the waste heat turns the water to steam, and you get an internal combustion steam engine). Any high school student with a lawn mower engine can demonstrate this for themselves over a weekend. So why aren't we making use of it commercially?

I first heard about this from a CNN piece in 1992, some college professor had rediscovered it mixing water and gasoline 50/50 with a simple detergent. My electronics lab professor back in college said there was a device called "the bubbler" back in the 1960's you could retrofit your car with to mix the water/gas/detergent as you used them (to avoid settling back into layers like salad dressing in the tank, with the detergents of the day). In the late 90's Caterpillar (the construction equipment people) had a patent on some sort of microscopic sponges that formed a colloid sludge that wouldn't settle out, and which let you go up to 80% water 20% fuel. And of course there's ultrasonic emulsification and so on. People keep rediscovering this but it never goes anywhere.

Expecting the US medical complex to seriously start using something as simple as refrigeration to keep people's brains alive is a bit like expecting NASA to put a colony on Mars: we won't live to see it because they suck so badly they were actually _better_ at a lot of things 40 years ago than they are today. And for most of that time the alternatives were homeopathic quacks and astrologers, respectively, so looking outside the market the guilds have cornered is a wasteland of scum and villainy.

At least NASA was founded with a vision (although once it put a man on the moon and returned him safely to earth before that decade was out, it went on to coast for 40 years of sheer bureaucratic inertia), and more recently the X Prize kicked some life into 'em. (SpaceX, Dragon, woo!)

Back in college, I'd hoped the biotech companies might do some of the same sort of cutting-edge research with medicine, but there's actually not a lot of money in cures. The money is in _treatments_ for chronic conditions you can milk for decades, actually _curing_ people kills your cash cow. So instead they do cargo cult programming on the genomes of food, and wind up with cattle feed grass that produces clouds of poison gas. Because that's where the money is. Wheee.

Oh, and another fun little dysfunction: of course nothing in this article should be construed as medical advice, you lawsuit-happy bastards. (If you were wondering why Rome fell...)

I'm really looking forward to the baby boomers dying off and ceasing to have a disproportionate impact on the country. The current Republican party is the penance we pay for the 1960's. They were teenagers when we went to the moon and invented the internet, now they're dried up old fogies trying very hard to take it (the country) with them and make us all pray them into an afterlife of their choosing. (This is apparently how you make progress, old people die and the new people use the ideas they learned before they knew everything.)

Next time, why my old Yahoo email handle back in the 90's was "telomerase". :)

  • Current Mood
    cold

Oh right, livejournal.

In case it isn't obvious, I went back to my other blog when the Parallels gig ended. I learned a bunch and Kir is a great guy (running the OpenVZ booth with him at SCALE was the highlight of the whole experience), but looking back I have a few observations:

1) Telecommuting for a company on the other side of the planet is challenging even when you're _not_ their only telecommuter. Being on the other side of the planet from a bunch of guys all in the same building is not a happy thing.

2) The technologies pitched in an interview (containers!) and the technologies you wind up working exclusively on the entire time (NFS) aren't always on the same continent either.

3) Culture clashes can be a thing. (The Russian consulate had a brochure warning me about the not smiling thing. It should have had one about "If someone has a complaint, they will stop talking/replying to you for months until the source of the complaint goes away. When this is your immediate supervisor, it can be a problem.")

I am sad that two round trips to the other side of the planet don't add up to enough Delta frequent flyer miles to get one domestic flight. I am happy I don't have to go there again. (When your trip preparation instructions explain the amount of cash you should carry to pay the standard police officer bribe from the random shakedowns, when the water cooler conversation is explaining _why_ the government stole an oil company, when you have to assure relatives that the bomb in the airport was a week after your trip and anyway it was the _other_ airport in the capital city, when boingboing coincidentally posts more than one long article about the murder and kidnapping of foreign entrepreneurs impacting investment in that country... Not a place I felt a huge _need_ for a third visit to.)

Still: learning experience.


todo list collating.

My todo list has once again exploded to the point where everything is distracting me from everything else and I'm forgetting what my todo items ARE, so it's time to write it down and prioritize again. (And this doesn't even include long-term stuff like containerizing the 9p filesystem or fuse, or testing LXC on non-x86 hardware platforms.)


Yay code review. NFS Lifetime rules are still brain-bending.

Ok, found a workaround for the linux-2.6.39-rc1 hang that Jens Axboe's been distracted from solving for a couple weeks: disable preemption. So I can go back to testing/developing against Linus's current -git tree instead of 2.6.38, which is good.

My Ottawa Linux Symposium paper submission 'Why Containers Are Awesome' has been approved, meaning I need to actually _write_ the paper now. I'm collecting a file full of links, might take a stab at it this weekend...

On the NFS front I'm pulling back from my ongoing battle with lockd/statd (which are a horrible mess from a design level and apparently always have been), because I managed to poke some people into reviewing my NFS patches (yay!), and Serge Hallyn raised a good point about lifetime rules in the third patch. Which means I have to reevaluate the lifetime rules, which are always the hard part with kernel stuff. Sigh.


Back from a week in Moscow.

So I got my NFSv3 containerization patches submitted. There are three of them for the basic network namespace support for NFSv3 in what's probably the correct approach. So far, nobody's cared to comment on them (even the people who were interested in the topic before, who I cc'd on the submission), although I tracked some down on Skype and they promised to bump review up on their todo list.

Today, in theory I'm containerizing lockd (and statd, both of which are a horrible incestuous nightmare which is probably going to require one instance per container). In practice, I got distracted reading the LXC source code. It's... very verbose for what it does.

  • Current Mood
    chipper

I need more test cases.

I've either run into a weird subtle bug in the kernel, or a weird subtle bug in kvm, and I can't tell which it is.

When I set up the "two meanings for the same IP" routing, mounting NFS inside the container (via tun/tap eth1) makes that address say "no route to host" outside the container (via -net user eth0). The two should be orthogonal, but something's getting interfered with.

I can reproduce it with a kvm "-net user" interface and a tun/tap interface, but I can't reproduce it with two tun/tap interfaces attached to kvm. I can reproduce it with nfs access in the emulated kernel, but not from userspace.

Except those previous two statements conflict. KVM doesn't know anything about userspace vs kernel space in the emulated kernel, and the kernel doesn't know about differing virtual implementations behind the e1000 emulated hardware interface.

It's hard to debug a problem with the -net user interface because I can't ping or tracepath through it, so when it's failing to connect I dunno _why_. Which is why I switched eth0 to be another tun/tap interface and tried to replicate the bug there, but so far I can't. Except if the bug _is_ in qemu's -net user, I should be able to reproduce it from userspace in the emulated kernel. KVM has no way of knowing if packets come from userspace or from kernel space inside the emulated system.

Grrr. The worst kind of debugging issue is "I changed something irrelevant and the problem went away". THAT'S NOT HOW DEBUGGING WORKS. You find out what was wrong and fix it, or it resurfaces to bite you again later.

Hmmm, maybe I can upgrade qemu (switch from kvm to qemu and build from source via current qemu-git repository) and see if _that_ makes the user+tap problem go away. Fixing it via upgrading qemu is reasonably strong evidence it was a bug in qemu, and if so it's orthogonal to the NFS patch (and probably fixed upstream already anyway, ubuntu's kvm is a bit old and ubuntu has a history of breaking qemu anyway)...
  • Current Mood
    frustrated

Why debugging is harder than coding.

To test NFS containerization, I need to set up conflicting network routing. I need to come up with something the container can access but the host can't, and vice versa. To do this, I used my three layer setup (laptop/kvm/container, described here) so I can set up routings on the laptop which are a couple hops away from the point of view of the containers and the container host (I.E. the kvm system).

Initially, to get a routing the container could access and the host couldn't, I set up a alias on the laptop and ran an NFS server on it. The KVM's eth0 interface is using the default QEMU masquerading LAN and thus gets the address, so it can't see any _other_, and life is good.

Then to set up a routing that the KVM system could see but the container couldn't, I ran an NFS server on the laptop's The KVM system could access that via the alias in the virtual LAN its eth0 is plugged into, but the tap device the container uses has no way to route out to, the container would see its own loopback interface instead.

Then right before SCALE I changed my test setup so that the mount command in the container and the mount command on the kvm system were identical, both using On the laptop I set up a alias, and ran the NFS server on that and another instance on So the container and the container's host were both mounting NFS on, but should be connecting to DIFFERENT servers when they did this.

The failure mode for the first test setup is "server not found", because if it uses the wrong network context, it'll route to an address that isn't running an NFS server. (The local address hides the remote address.) The failure mode for the second setup is accessing the wrong server: it's always going to route out to a remote address, the question is whether or not it gets the right one. (Side note: unfortunately you can't tell the NFS server "export this path as this path" because NFS servers are primitive horrible things configured via animal sacrifice and the smoke signals from burnt offerings. What I should really do is run one of the two instances in a chroot to get different contents for the different addresses. Generally I just killed one to see if that NFS mount became inaccessible or not.)

Over the past couple months I've made the first test setup work (more by adding lots of "don't do that" options to the mount -o list than by patching the kernel, but at least I got it to _work_). This second test setup spectacularly _did_not_work_, and it failed in WEIRD WAYS. Not only does the NFS infrastructure inappropriately cache data and re-use structures it thought referred to the same address (because its caching comparisons didn't take network context into account), but the network layer itself doesn't seem entirely happy routing to two different versions of the same IPv4 address at the same time. (After doing an NFS mount in the container, the host can't access that address anymore. Can't do an NFS mount, can't do a wget... an ssh server bound to that address can't take incoming connections. UNTIL, that is, the container opens a normal userspace connection to that address, such as running wget in the container. I have no idea what's going wrong there, but it's easily reproducible.)
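To make the caching half of that concrete, here's a toy model of the class of bug: a cache keyed on the address alone hands the host whatever entry the container created first. This is a sketch of the idea, not the kernel's actual NFS or sunrpc data structures. The class, the `connect` stand-in, the namespace labels and the address (a documentation placeholder, not one from my test setup) are all hypothetical.

```python
class ServerCache:
    """Toy cache of server connections, optionally aware of network context."""

    def __init__(self, namespace_aware):
        self.namespace_aware = namespace_aware
        self.entries = {}

    def _key(self, netns, address):
        # The bug being modeled: comparing on address alone, ignoring
        # which network context the request came from.
        return (netns, address) if self.namespace_aware else address

    def lookup(self, netns, address, connect):
        key = self._key(netns, address)
        if key not in self.entries:
            self.entries[key] = connect(netns, address)
        return self.entries[key]


def connect(netns, address):
    # Hypothetical stand-in for routing: the same address reaches a
    # different server depending on which namespace you start from.
    return f"server-reached-from-{netns}"


ADDR = "192.0.2.1"  # placeholder documentation address

if __name__ == "__main__":
    buggy = ServerCache(namespace_aware=False)
    buggy.lookup("container", ADDR, connect)
    print(buggy.lookup("host", ADDR, connect))  # host gets the container's entry

    fixed = ServerCache(namespace_aware=True)
    fixed.lookup("container", ADDR, connect)
    print(fixed.lookup("host", ADDR, connect))  # host gets its own entry
```

The only difference between the two runs is whether the network context is part of the comparison key, which is roughly the shape of what has to change wherever the NFS code compares addresses. It says nothing about the routing weirdness, which is a separate problem.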

The problem is, right as I started debugging this second test setup I was pulled away for several intense days of working the OpenVZ booth at the SCALE conference, and then I got sick for most of a week afterwards with the flu, and by the time I got back to working on NFS I'd forgotten I'd changed my test setup.

So now I was dealing with VERY DIFFERENT SYMPTOMS, and all sorts of strange new breakage, and I couldn't reproduce the mostly working setup I'd had before, and I couldn't figure out WHY. At first I blamed my "git pull" and tried to bisect it, but a vanilla 2.6.37 was doing this too and I KNEW that used to work. Sticking lots and lots and lots of printk() statements in the kernel wasn't entirely illuminating either. (Once you're more than a dozen calls deep in the NFS and sunrpc code, it's hard to keep it all in your head and remember what you were trying to _do_ in the first place.)

And of course the merge window opened, so I wanted to submit the patches I'd gotten working so far, but I always retest patches before submitting them and when I did that they DID NOT WORK so obviously I couldn't submit them until I figured out WHY...

It wasn't until today that I worked out why what I had USED to work, and where I'd opened a new can of worms that broke everything again. The code hadn't changed, my test sequence had. (It's a perfectly valid test sequence that _should_ work. The kernel _is_ broken. But it's not _new_ breakage, and my patch does fix something _else_ and make it work where it didn't work before, and thus I can give it a good description to help it get upstream.)

So yeah, I've had a fun week.