[OpenAFS] NAT and changing ip addresses

Jeffrey Hutzelman <jhutz@cmu.edu>
Mon, 8 Jan 2001 17:35:14 -0500 (EST)


On Mon, 8 Jan 2001, Jim Rees wrote:

> I think most of you are missing the fundamental problem with afs and nat.
> The problem isn't that the client's ip address can change (that's solvable;
> see below) but that each client must be able to receive packets addressed to
> it from the server to port 7001.  This is ok if you only have one client
> behind the nat box, but if you have more than one you have trouble.
> 
> When the nat box sees a packet originating at a client on port 7001, it sets
> up a flow such that any packet addressed from the server to the nat box on
> port 7001 will go to the client named in the flow.  Now if you add another
> client, you have two clients sending from port 7001, and the nat box doesn't
> know which client to send the reply packets to.  I see no solution to this
> problem.

Actually, there's a simple solution to this problem -- configure the NAT
such that each client gets its own source port.  There's absolutely
nothing in the AFS protocol or any major implementation which requires
that clients use port 7001.  That is done only because any RXAFS client
must also be an RXAFSCB server, and Rx presently requires that anything
that wants to be a server pick a port number up front.  I'll be submitting
patches later this week that allow a server to be started on the "random"
port assigned when rx_Init(0) is called, which will allow user-mode RXAFS
clients that use randomly-selected ports.
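
To make the NAT side of this concrete, here is a minimal sketch of the
kind of port-rewriting table I have in mind.  It is a toy model, not any
real NAT implementation: the class name, method names, and the starting
port of 50000 are all made up for illustration.  The point is only that
two clients both sending from port 7001 end up with distinct external
ports, so replies from the server can be routed back unambiguously.

```python
class PortRewritingNat:
    """Toy NAT table: every (client_ip, client_port) gets its own external port.

    Hypothetical illustration only; names and the port range are assumptions.
    """

    def __init__(self, first_port=50000):
        self._next_port = first_port
        self._outbound = {}  # (client_ip, client_port) -> external port
        self._inbound = {}   # external port -> (client_ip, client_port)

    def outbound(self, client_ip, client_port):
        """Return the external source port to use for this client flow."""
        key = (client_ip, client_port)
        if key not in self._outbound:
            ext_port = self._next_port
            self._next_port += 1
            self._outbound[key] = ext_port
            self._inbound[ext_port] = key
        return self._outbound[key]

    def inbound(self, ext_port):
        """Return the (client_ip, client_port) a reply should go to, or None."""
        return self._inbound.get(ext_port)


nat = PortRewritingNat()
port_a = nat.outbound("10.0.0.2", 7001)  # first client behind the NAT
port_b = nat.outbound("10.0.0.3", 7001)  # second client, same source port
```

Here port_a and port_b differ, and nat.inbound() on each one gives back
the right internal client, even though both clients believe they are
talking from port 7001.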

> The problem of a client changing ip addresses is easier.  We've been doing
> this for years in disconnected afs (which I hope to incorporate into OpenAFS
> at some point).  Right now it's done manually but I think it could be
> automated.
>
> When a client changes ip address it must discard all state associated with
> that ip address.  That state consists of callbacks on the client and server,
> and rx connections on the client (and maybe server).  Callbacks are
> discarded by calling afs_FlushVCBs().  At the end of this message I have
> attached the routine we use to discard connections.

With modern clients and servers, it's not even that bad.  The fileserver
identifies "hosts" by IP address and port, but keeps track of them by
UUID.  When it gets a connection from an (ipaddr,port) tuple it doesn't
know about, it does an RXAFSCB_WhoAreYou RPC to get the client's UUID and
a list of its interfaces.  h_GetHost_r() in viced/host.c tries real hard
to handle all the possible cases of clients moving around, addresses once
used by an old (pre-WhoAreYou) client now getting used by a new one, and
so on. 
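
Roughly, the lookup described above goes like the following sketch.
This is emphatically not the code in viced/host.c, which handles many
more cases; it is a simplified model, and HostTable, get_host, and the
who_are_you callable (standing in for the RXAFSCB_WhoAreYou RPC) are
names invented here for illustration.

```python
class HostTable:
    """Toy model of a host table: found by (ip, port), tracked by UUID."""

    def __init__(self, who_are_you):
        # who_are_you is a hypothetical callable standing in for the
        # RXAFSCB_WhoAreYou RPC: it takes (ip, port) and returns a UUID.
        self._who_are_you = who_are_you
        self._uuid_by_addr = {}   # (ip, port) -> uuid
        self._addrs_by_uuid = {}  # uuid -> list of (ip, port), in order seen
        self.rpc_count = 0        # how many times we had to ask the client

    def get_host(self, ip, port):
        addr = (ip, port)
        uuid = self._uuid_by_addr.get(addr)
        if uuid is not None:
            return uuid  # known tuple, no RPC needed

        # Unknown tuple: ask the client who it is.
        self.rpc_count += 1
        uuid = self._who_are_you(ip, port)
        self._uuid_by_addr[addr] = uuid
        # If the UUID is already known, the new address is simply added.
        self._addrs_by_uuid.setdefault(uuid, []).append(addr)
        return uuid

    def addresses(self, uuid):
        return list(self._addrs_by_uuid.get(uuid, []))


# One client (uuid-A) that shows up from two different addresses.
clients = {
    ("192.0.2.1", 7001): "uuid-A",
    ("192.0.2.9", 7001): "uuid-A",
}
table = HostTable(lambda ip, port: clients[(ip, port)])
table.get_host("192.0.2.1", 7001)
table.get_host("192.0.2.9", 7001)
```

After those two calls the table holds a single host, uuid-A, with both
addresses on its list.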

There is one case that it arguably gets wrong, which is that when a
brand-new IP address is used by a host whose UUID is already known to the
server, the new interface gets _added_ to the server's idea of what
addresses that host has.  This means that when the server tries to break a
callback to that host, it may try the "old" address first, which could
slow things down.  But this will only happen once -- when the server
discovers an interface on which it can reach a host, it keeps using that
interface for breaking callbacks until it stops working.  Note that if the
IP address in question actually gets reused by another client, the server
will end up throwing away the host structure for the first client, forcing
all of its callbacks to be broken the next time it talks to the server.
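
The "only happens once" behavior can be modeled with a small sketch.
Again, this is an illustration under my own assumptions rather than the
fileserver's actual code: break_callback, the host dictionary layout,
and the send callable (which returns True if the client answered at
that address) are all hypothetical.

```python
def break_callback(host, send):
    """Try the last known-good address first, then the others in order.

    host is a dict with "addrs" (list of addresses) and "preferred"
    (the address that last worked, or None).  send(addr) is a
    hypothetical callable returning True if the client was reached.
    """
    candidates = list(host["addrs"])
    preferred = host.get("preferred")
    if preferred in candidates:
        candidates.remove(preferred)
        candidates.insert(0, preferred)

    for addr in candidates:
        if send(addr):
            host["preferred"] = addr  # keep using this one until it fails
            return addr

    host["preferred"] = None  # nothing worked
    return None


old_addr = ("192.0.2.1", 7001)  # address the client no longer has
new_addr = ("192.0.2.9", 7001)  # address the client moved to
host = {"addrs": [old_addr, new_addr], "preferred": None}

tried = []

def send(addr):
    tried.append(addr)
    return addr == new_addr  # only the new address answers

break_callback(host, send)  # tries old_addr, then new_addr
break_callback(host, send)  # goes straight to new_addr
```

The first call pays for one failed attempt at the old address; the
second call does not, because the working address is remembered.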

So, things should pretty much work now, as long as you have modern clients
and servers (IBM AFS 3.5 or newer, or any OpenAFS version).

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA