The Conversation Pit - Catching up on writeups.
Rob Landley
Catching up on writeups. [Feb. 3rd, 2011|05:00 pm]
[mood |irate]

This is from the second half of last week, but to get it out of my pending directory:


With NFS, even _finding_ where the address info is stored is a huge challenge. "Start with mount?" you say? If you grep for the string "mount" in fs/nfs you get 491 hits.

Start with the address? Of _what_? Remember how there isn't _one_ NFS server? There's portmap and mountd and nfsd and it does DNS lookups and lockd and there's something called a "referral server" which I don't even want to KNOW about, although possibly that's just NFSv4. Oh, and it has explicit support for automount, which kind of defeats the purpose of automount if you ask me. Oh, and the caching level, fscache.c "per-client index cookie". There is an "nfs_clone_mount" struct, which really strikes me as the sort of thing the VFS layer should be handling, but there's legacy crap everywhere. This entire filesystem _is_ legacy crap.

Alright, let's start with "struct file_system_type" which has to be defined somewhere, and should tell us what mount function actually gets used? So grep for struct, the name of the struct, and a line with an equals sign on it, and it turns out that in super.c there are a bunch of them:

super.c:static struct file_system_type nfs_fs_type = {
super.c:struct file_system_type nfs_xdev_fs_type = {
super.c:static struct file_system_type nfs4_fs_type = {
super.c:static struct file_system_type nfs4_remote_fs_type = {
super.c:struct file_system_type nfs4_xdev_fs_type = {
super.c:static struct file_system_type nfs4_remote_referral_fs_type = {
super.c:struct file_system_type nfs4_referral_fs_type = {

Lovely. Of the seven, _five_ of them are for NFS4. In case you wonder why this protocol hasn't taken off and eclipsed NFS3, that's about the amount of complexity bloat you get with NFS4. "NFS is overcomplicated due to the legacy of premature optimization and the tendency to layer new code over problems instead of actually fixing what's there. Let's make a whole NEW one that's MUCH MUCH BIGGER!" Sigh. It's "the future" the way ISDN was the future, the way Microsoft Bob was the future. Zero uptake so far, and much head-scratching on the part of its proponents as to why.

So let's look at the other two. (Glancing again at the "multiple mounts share a superblock" comment at the top of super.c and pondering once again how to sabotage that for containers.)

It seems that the difference between nfs_fs_type and nfs_xdev_fs_type is that nfs_fs_type hasn't got a .mount entry and the xdev one does. (What?) Except that the xdev thing appears to be the automount stuff, so that mount function is _not_ the one that mount actually calls.

I hate this codebase.

Meanwhile, over in "looking at the wire-protocol and reading the RFCs" land, whoever wrote these RFCs should really be shot. The less-irrelevant ones are:

RFC 1014 - XDR
RFC 1057 - RPC
RFC 1094 - NFSv2
RFC 1813 - NFSv3

Alright, here's the effective protocol for the start of an NFSv3 mount, skipping portmap. All these fields are stored big-endian:

struct rfs_request_crap {
  uint32_t seqnum;   // random cookie to match up request/reply/retrans
  uint32_t is_reply; // 0 for calls, 1 for replies
  uint32_t always2;  // always equals 2
  uint32_t server;   // portmapd, mountd, nfsd, lockd
  uint32_t version;  // 3 for NFSv3
  uint32_t function; // Which function to call
  struct auth_crap { // session info, basically.
    uint32_t type;   // 0 = none, 1 = unix
    char padding[];  // variable length on the wire
  } auth;
  char args[];       // Function arguments (if any)
};

The session stuff includes a "verifier" that predates modern cryptography. You kind of want to pat it on the head for being so utterly useless.
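
For what it's worth, here's a minimal sketch of serializing that request header in C. It assumes the "no session" case, where the credential and the verifier are each just a type word of 0 followed by a length word of 0 (that flavor-plus-length layout is how RFC 1057 describes its auth blocks). The names `put_be32` and `build_rpc_call` are made up for illustration; they are not from the kernel or any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value big-endian, which is the byte order XDR uses. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Hypothetical helper: write an RPC call header with an empty credential
 * and an empty verifier into buf. Returns the number of bytes written
 * (always 40), or 0 if buf is too small.
 */
size_t build_rpc_call(uint8_t *buf, size_t buflen, uint32_t seqnum,
                      uint32_t server, uint32_t version, uint32_t function)
{
    const uint32_t words[10] = {
        seqnum,   /* cookie to match up request/reply/retrans */
        0,        /* is_reply: 0 for calls */
        2,        /* always2 */
        server,   /* which program to talk to */
        version,
        function,
        0, 0,     /* credential: type 0 (none), length 0 */
        0, 0,     /* verifier:   type 0 (none), length 0 */
    };
    size_t i;

    if (buflen < sizeof(words))
        return 0;
    for (i = 0; i < 10; i++)
        put_be32(buf + 4 * i, words[i]);
    return sizeof(words);
}
```

Calling `build_rpc_call(buf, sizeof(buf), 0x048a6816, 0x186a5, 3, 0)` fills in the same 40 bytes as the first packet in the hex dump below. No arguments follow because function 0 doesn't take any.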

So, in the first packet:

00000000 04 8a 68 16 00 00 00 00 00 00 00 02 00 01 86 a5
00000010 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00
00000020 00 00 00 00 00 00 00 00

We have: cookie, not a reply, the 2, a request for server 0x186a5,
nfs version 3, function 0, and 16 bytes of padding. There's no
session yet, and there's apparently no arguments. I dunno how long
either field's supposed to be yet.
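
Going the other direction, here's a rough decoder for that first packet. Two things it leans on that aren't spelled out above: 0x186a5 is decimal 100005, the standard RPC program number for mountd, and RFC 1057 defines the credential and the verifier as a type word, a length word, and a body of that length (at most 400 bytes, padded to a multiple of 4). Read that way, the 16 bytes of "padding" are two empty auth blocks. The struct and function names here are invented for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

struct rpc_call_hdr {
    uint32_t xid, msg_type, rpcvers, prog, vers, proc;
    uint32_t cred_flavor, cred_len;
    uint32_t verf_flavor, verf_len;
    size_t args_off;               /* where the function arguments start */
};

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read one type/length/body auth block at *off. Returns 0 on success. */
static int get_auth(const uint8_t *buf, size_t len, size_t *off,
                    uint32_t *flavor, uint32_t *body_len)
{
    size_t padded;

    if (len - *off < 8)            /* caller guarantees *off <= len */
        return -1;
    *flavor = get_be32(buf + *off);
    *body_len = get_be32(buf + *off + 4);
    *off += 8;
    if (*body_len > 400)           /* RFC 1057 limit on auth bodies */
        return -1;
    padded = ((size_t)*body_len + 3) & ~(size_t)3;
    if (len - *off < padded)
        return -1;
    *off += padded;
    return 0;
}

/* Hypothetical parser: returns 0 on success, -1 if short or malformed. */
int parse_rpc_call(const uint8_t *buf, size_t len, struct rpc_call_hdr *h)
{
    size_t off = 24;

    if (len < 24)
        return -1;
    h->xid = get_be32(buf);
    h->msg_type = get_be32(buf + 4);
    h->rpcvers = get_be32(buf + 8);
    h->prog = get_be32(buf + 12);
    h->vers = get_be32(buf + 16);
    h->proc = get_be32(buf + 20);
    if (h->msg_type != 0 || h->rpcvers != 2)
        return -1;
    if (get_auth(buf, len, &off, &h->cred_flavor, &h->cred_len) ||
        get_auth(buf, len, &off, &h->verf_flavor, &h->verf_len))
        return -1;
    h->args_off = off;
    return 0;
}

/* Well-known program numbers for the servers mentioned in this post. */
const char *rpc_prog_name(uint32_t prog)
{
    switch (prog) {
    case 100000: return "portmap";
    case 100003: return "nfs";
    case 100005: return "mountd";
    case 100021: return "lockd";
    default:     return "unknown";
    }
}
```

Fed the 40 bytes above, this reports program 100005 (mountd), version 3, function 0, both auth blocks empty, and arguments starting at offset 40, which is the end of the packet. Since the program is mountd, the 3 is presumably the version of the mount protocol (the one that goes with NFSv3) rather than of NFS proper.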

The reply:

00000000 04 8a 68 16 00 00 00 01 00 00 00 00 00 00 00 00
00000010 00 00 00 00 00 00 00 00

struct rfs_reply_crap {
  uint32_t seqnum;       // Same cookies as last time
  uint32_t is_reply;     // It's 1 this time.
  uint32_t was_rejected; // 0 for success.
  char session[];        // Insecure authentication crap predating modern cryptography
  uint32_t status;       // 0 for success
  char results[];
};
Note that the direction these packets are going in apparently has no bearing on whether or not they need to be annotated with an is_reply field. Sigh.

The "rejected" field means either it wasn't well-formed (ha!) RPC/XDR, or it lost track of your login session. The "status" only enters into it if the thing actually managed to run a remote procedure to have something to return.
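
The two-level check can be sketched like this. It assumes the session field in an accepted reply is one RFC 1057 style verifier (type word, length word, body padded to a multiple of 4), and it makes no attempt to decode the body of a rejected reply, which is laid out differently. `parse_rpc_reply` and its struct are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct rpc_reply_hdr {
    uint32_t seqnum;
    uint32_t was_rejected;   /* 0 = accepted, nonzero = rejected */
    uint32_t verf_flavor, verf_len;
    uint32_t status;         /* only meaningful when accepted */
    size_t results_off;      /* where the results start */
};

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical parser. Returns 0 for an accepted reply (status filled
 * in), 1 for a rejected one, -1 if the buffer is short or isn't a reply.
 */
int parse_rpc_reply(const uint8_t *buf, size_t len, struct rpc_reply_hdr *h)
{
    size_t off = 20, padded;

    if (len < 12)
        return -1;
    h->seqnum = get_be32(buf);
    if (get_be32(buf + 4) != 1)      /* is_reply must be 1 */
        return -1;
    h->was_rejected = get_be32(buf + 8);
    if (h->was_rejected != 0)
        return 1;                    /* the procedure never ran */

    if (len < 20)
        return -1;
    h->verf_flavor = get_be32(buf + 12);
    h->verf_len = get_be32(buf + 16);
    if (h->verf_len > 400)           /* RFC 1057 limit on auth bodies */
        return -1;
    padded = ((size_t)h->verf_len + 3) & ~(size_t)3;
    if (len - off < padded || len - off - padded < 4)
        return -1;
    off += padded;
    h->status = get_be32(buf + off); /* 0 for success */
    h->results_off = off + 4;
    return 0;
}
```

On the 24-byte reply above this returns 0 with the same cookie, an empty verifier, status 0, and results starting at offset 24, meaning there are none.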

I continue to hate NFS.