[OpenAFS-devel] rx multihoming

Andrew Deason adeason@sinenomine.net
Fri, 16 Aug 2013 13:15:29 -0500


[new thread, trying to get this away from the rest]

On Thu, 15 Aug 2013 11:49:51 -0400
Jeffrey Altman <jaltman@your-file-system.com> wrote:

> On 8/15/2013 11:18 AM, Jeffrey Hutzelman wrote:
>
> > That is a bug.  Either the address and port are part of the
> > connection identifier or they're not.  If they are, then the
> > connection from a different address and port is, to the server, a
> > different connection and must be treated as such.  If not, then it
> > should treat the incoming packets as part of that connection.
> > 
> > Simply dropping incoming packets on the floor because they have the
> > same epoch/cid as an existing connection but not the same address is
> > not reasonable.  Of course this is causing problems!

I believe Jeff A is describing what is actually happening, though. I'm
not sure to what degree this is avoidable. I'm sure you are aware of a
lot of the background here, but for everyone:

For connections with the multihoming bit set on the epoch (which iirc is
all connections we create these days), each endpoint will accept packets
with that epoch/cid from any address. From what I recall of discussing
this before, we do this because if we didn't, then whenever a machine
sends a UDP packet over a different outgoing interface than we expect,
those packets all get dropped and there's not much we can do about it.
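
To make that concrete, here is a rough sketch of the kind of check I
mean. The struct and the EPOCH_MULTIHOME_BIT flag are made-up stand-ins
for illustration, not the actual rx_connection layout or the real epoch
bit:

/* Illustrative only; these are not the real rx structures. */
#include <stdint.h>
#include <netinet/in.h>

#define EPOCH_MULTIHOME_BIT 0x80000000u  /* assumed flag bit, for illustration */

struct example_conn {
    uint32_t epoch;
    uint32_t cid;
    struct in_addr peer_addr;   /* address we send packets to */
    uint16_t peer_port;         /* host byte order */
};

/* Does an incoming packet belong to this connection? */
static int
packet_matches_conn(const struct example_conn *conn,
                    uint32_t pkt_epoch, uint32_t pkt_cid,
                    struct in_addr src_addr, uint16_t src_port)
{
    if (pkt_epoch != conn->epoch || pkt_cid != conn->cid)
        return 0;
    if (conn->epoch & EPOCH_MULTIHOME_BIT)
        return 1;   /* epoch/cid match is enough; ignore the source address */
    return src_addr.s_addr == conn->peer_addr.s_addr &&
           src_port == conn->peer_port;
}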

For example, say you create a connection to fileserver address
192.0.2.100. Maybe that is an IP the server uses specifically for AFS,
and its "main" IP is 192.0.2.5. When the kernel sends a UDP packet in
response, it goes out over 192.0.2.5, so the client gets a UDP packet
from 192.0.2.5. If the client restricts itself to ignore any packets not
from 192.0.2.100, then the connection will never work unless we create a
connection to 192.0.2.5 specifically; we have no way of knowing that
that is the correct address.
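
There is nothing OpenAFS-specific about why the reply comes from a
different address; any UDP responder bound to INADDR_ANY answers with
whatever source IP the kernel's route lookup picks. A minimal standalone
demo (the port number is arbitrary):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

int
main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in any, from;
    socklen_t fromlen = sizeof(from);
    char buf[1500];
    ssize_t n;

    memset(&any, 0, sizeof(any));
    any.sin_family = AF_INET;
    any.sin_addr.s_addr = htonl(INADDR_ANY);  /* listen on all interfaces */
    any.sin_port = htons(7009);               /* arbitrary port for the demo */
    bind(s, (struct sockaddr *)&any, sizeof(any));

    n = recvfrom(s, buf, sizeof(buf), 0, (struct sockaddr *)&from, &fromlen);
    if (n >= 0) {
        /* The reply's source IP is chosen by the kernel's routing decision;
         * on a multihomed host it may not be the address the sender used. */
        sendto(s, buf, (size_t)n, 0, (struct sockaddr *)&from, fromlen);
    }
    close(s);
    return 0;
}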

However, in order to keep just anyone in the world from hijacking the
connection completely, a peer will only _send_ packets to a specific IP,
even though it accepts packets from any IP. So in the scenario described
by Jeff A, the fileserver would accept packets on that connection if the
client appears to have "moved", but it would only send packets back to
the old address ("packets" here meaning everything: pings, acks, etc.),
so from the client's perspective the fileserver appears completely dead
for that connection.
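
The sending side of that asymmetry, sketched with the same made-up
example_conn struct from above (again illustrative, not the actual rx
code):

#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Reuses the hypothetical 'struct example_conn' from the earlier sketch. */
static void
send_to_conn(int sock, const struct example_conn *conn,
             const void *pkt, size_t len)
{
    struct sockaddr_in dst;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    /* Always the address recorded when the connection was created, never
     * the source address of whatever packet most recently matched. */
    dst.sin_addr = conn->peer_addr;
    dst.sin_port = htons(conn->peer_port);

    /* If the client really has moved, this goes to the old address, and
     * the client sees the server as unresponsive on this connection. */
    sendto(sock, pkt, len, 0, (struct sockaddr *)&dst, sizeof(dst));
}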

By the probable reasoning of the person who originally wrote this code,
this should only be necessary in one direction. That is, in the example
I gave where you need this multihoming stuff, only the client should
need to accept packets from any IP, since it doesn't know what address
the response packets will come from. The server knows it can get packets
from the original IP, since it has already gotten at least one packet
from there. If it gets a packet with a matching cid/epoch but from a
different IP, that may be the same client, but it could just as easily
be a new connection.
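
A one-directional version of the check would look something like this,
still with the made-up names from the earlier sketches; I'm not claiming
this is what the commented-out code did, just the shape of the idea:

/* Only the side that initiated the connection (the client) tolerates a
 * changed source address, since it cannot predict which address the
 * replies will come from; the responder (server) already knows a source
 * address that works. Illustrative only. */
enum conn_role { CONN_INITIATOR, CONN_RESPONDER };

static int
packet_matches_conn_oneway(const struct example_conn *conn,
                           enum conn_role role,
                           uint32_t pkt_epoch, uint32_t pkt_cid,
                           struct in_addr src_addr, uint16_t src_port)
{
    if (pkt_epoch != conn->epoch || pkt_cid != conn->cid)
        return 0;
    if (role == CONN_INITIATOR)
        return 1;   /* accept from anywhere; the reply address is unknown */
    return src_addr.s_addr == conn->peer_addr.s_addr &&
           src_port == conn->peer_port;
}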

In fact, the code used to apply this restriction in only one direction,
and the code to do that is commented out right now, not entirely
deleted. That asymmetry was removed, and the loose matching now applies
in both directions, as of commit
b4566d725e1aa4f57d1e6db5821c590a4b6da7c0 from 2004. I'm not sure exactly
what issue that commit message is referring to, but maybe some
multihomed clients can e.g. alternate IPs for the 'from' address on
outgoing packets. Maybe someone else can justify that better.

> I suspect that Jeff is correct and the new connection is being created
> but the sum of all of the things that happen at new connection time
> the client times out and marks the server down:

I don't think so; even with how broken idledead is, at least with
current code the idledead timeout should be really high for RW volumes.
I think you were right the first time.


I also think this has very little to do with NAT ping, but it's good to
understand and discuss nonetheless.

-- 
Andrew Deason
adeason@sinenomine.net