December 29th, 2010

Poking at NFS.

I've already composed and discarded two blog entries on NFS because I haven't comfortably wrapped my head around it yet. The kernel's Documentation/filesystems/nfs directory is useless, and the kernel's NFS source is a bit big to have an obvious entry point, partly because the way it's written you have to understand the VFS layer first. I re-read the paper for the last NFS talk I saw (OLS 2006, Why NFS sucks), which was more entertaining than informative. Its successor (Making NFS suck faster, with video) got quite interesting at around the 12 minute mark with the "principles of operation" overview (portmap, rpc.mountd (interprets /etc/exports, does DNS lookups), NFS threads, and so on). But as far as I can tell there's no substitute for reading RFC 1813 describing NFSv3 (circa 1995, meaning it's about as current as ext2). (I haven't read the previous one describing NFSv2 or the new one describing NFSv4 yet.)
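To make the portmap step of that overview a bit more concrete, here's a minimal sketch of the message a client sends to ask portmap which port NFSv3 is listening on. It only builds the bytes and doesn't touch the network. The function names and the xid value are mine, made up for illustration; the program numbers (100000 for portmap, 100003 for NFS), the GETPORT procedure number, and the record marking for TCP are what I understand the ONC RPC and portmap specs to define, so treat it as a reading aid rather than a reference implementation.

```python
import struct

# Constants as I understand them from the ONC RPC / portmap specs.
PMAP_PROG = 100000        # RPC program number of portmap itself
PMAP_VERS = 2             # portmap protocol version
PMAPPROC_GETPORT = 3      # "which port is program X on?"
NFS_PROG = 100003         # RPC program number of NFS
IPPROTO_TCP = 6
IPPROTO_UDP = 17

RPC_CALL = 0              # message type: call (a reply would be 1)
RPC_VERSION = 2           # ONC RPC protocol version


def build_getport_call(xid, prog=NFS_PROG, vers=3, proto=IPPROTO_TCP):
    """Build a portmap GETPORT call (hypothetical helper name).

    Every field is a 32-bit big-endian unsigned integer (XDR encoding).
    """
    header = struct.pack(">6I", xid, RPC_CALL, RPC_VERSION,
                         PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT)
    # AUTH_NULL is flavor 0 with a zero-length body. It appears twice:
    # once as the credential and once as the verifier.
    auth_null = struct.pack(">2I", 0, 0)
    # Arguments: program, version, protocol, and a port field the
    # server ignores in a GETPORT request.
    args = struct.pack(">4I", prog, vers, proto, 0)
    return header + auth_null + auth_null + args


def add_record_mark(msg):
    """Prefix the 4-byte record mark RPC uses over TCP.

    The high bit flags the last fragment; the low 31 bits are the length.
    Over UDP the message is sent as-is, without this prefix.
    """
    return struct.pack(">I", 0x80000000 | len(msg)) + msg


if __name__ == "__main__":
    call = build_getport_call(xid=0x12345678)
    print(len(call), call.hex())
```

The reply (not shown) carries a single port number, conventionally 2049 for NFS, and the same dance gets repeated for rpc.mountd before any actual file operation happens.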

Part of my problem is I've spent the past 10 years successfully avoiding NFS. The last time I had to use NFS was on Sun Workstations at college in the early 90's. When I moved to Linux, NFS was "that thing Red Hat 6 inexplicably exposes to the internet so worms can trivially root your server", the research I did at the time said it couldn't _be_ secured for general exposure to the internet (unlike Samba), and my interaction with it was to kill it with fire as part of OS configuration.

All the Linux setups I've seen in the past decade used Samba instead. (I'm aware my experience isn't universal and there are plenty of people using NFS, I just don't seem to know any of them.) When I first encountered QEMU (2005) it already had an option to automatically provide a Samba share to the virtual system, but it still doesn't have an option to launch NFS today. A Samba connection works over a single normal TCP/IP connection, with the client dialing out to the server and without the inexplicable need to initiate connections the other way through the firewall, and you can wrap it in SSL or tunnel it through ssh as you like. If the connection goes down, the client automatically attempts to reconnect. The server is a normal userspace process, just like web servers and ssh servers, and Samba has extensions to support POSIX. It's simple, straightforward, and capable of being exposed to the internet without guaranteeing a server compromise.

A few months ago one of my Aboriginal Linux contributors pointed me at unfs3, which sort of works. But I only used it for debugging purposes, and it turned out to hang under load when restricted to TCP operation, and I didn't really look into how it was implemented.

However, I've been asked to fix NFS mounts in containers, and after a couple weeks getting containers to work, I'm now familiarizing myself with the low-level design of a technology I've avoided looking at for years. And it is not a well-contained problem.

Oh, the additional fun is that I can't work in the vanilla kernel for this, so I have to make git branches work. Previously I've just played with the vanilla kernel and applied patches myself. I've used Mercurial for years, but with git I only ever learned enough to make git bisect work, because the user interface is utterly horrible. Apparently there's no way to check out a remote git branch without making a new local branch in your git repo. The NFS maintainer's tree has a master branch, and an "everything" branch, and an "nfs-all-next" branch, and a "for-2.6.38" branch, and I have no IDEA which of them I'm supposed to be using or what the differences between them actually are. (Note: git describe --tags shows 2.6.35-rc1 for all of them. According to "git log" the for-2.6.38 branch has merged 2.6.37-rc6 but the tags didn't propagate or something.)

But that's a blog post of its own...
  • Current Mood
    confused