[OpenAFS] Re: Ubik trouble

Jeffrey Hutzelman jhutz@cmu.edu
Mon, 13 Jan 2014 12:32:12 -0500

On Mon, 2014-01-13 at 15:00 +0100, Harald Barth wrote:

> (1) I had an old NetInfo file with a wrong IP addr lying around. This
> did _not_ prevent the server from starting, nor did it prevent sync
> completely. The protection server synced fine and the volume location
> server refused.

The NetInfo and NetRestrict files serve as filters on the actual set of
addresses found by enumerating interfaces.  Mentioning an address that
the machine does not have has no effect.
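
To make the filtering idea concrete, here is a rough sketch in Python.  It
is only a model of the behavior described above, not the actual OpenAFS
code; the function name and arguments are made up for illustration, and any
special cases the real file parsers handle are ignored.

```python
def effective_addresses(interface_addrs, netinfo=None, netrestrict=None):
    """Return the addresses a server would use, in interface order.

    Simplified, assumed model: NetInfo (if given) keeps only the listed
    addresses, NetRestrict (if given) removes the listed addresses.
    Entries naming addresses the machine does not have match nothing.
    """
    result = []
    for addr in interface_addrs:
        if netinfo is not None and addr not in netinfo:
            continue  # not selected by NetInfo
        if netrestrict is not None and addr in netrestrict:
            continue  # explicitly excluded by NetRestrict
        result.append(addr)
    return result


# Hypothetical machine with two addresses (documentation address ranges).
interfaces = ["192.0.2.43", "192.0.2.46"]
# The stale entry 198.51.100.7 matches no interface, so it changes nothing.
print(effective_addresses(interfaces, netinfo={"192.0.2.43", "198.51.100.7"}))
```

The point is that the files are never a source of addresses in this model;
they only narrow down what interface enumeration already found.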

> (2) I have a machine where the database server is known as X.Y.Z.43
> but the machine's primary IP is X.Y.Z.46. This seems to work well
> until something somewhere checks the source address of the traffic
> when sync is tried. Result: The protection server synced fine and the
> volume location server refused. 

I'm not sure why your vlserver and ptserver are behaving differently,
unless they are different versions or you have some port-specific filter
or the like.

When multi-homed Ubik servers are used, the CellServDB used by the Ubik
servers must list each server exactly once.  Further, each server's
CellServDB must use the same set of servers; it won't work to have a
server identified by one address in one copy of the file and a different
address in another copy.  The CellServDB files used by clients and
fileservers can list every address for every server, though getting a
fileserver and Ubik server on the same machine not to use the same
CellServDB can be... challenging.
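
The two rules above can be checked mechanically.  The following is a
minimal sketch, assuming you have already parsed each Ubik server's
CellServDB into a list of addresses and that you know which addresses
belong to which host; the function and both data structures are
hypothetical, not part of any OpenAFS tool.

```python
def check_ubik_cellservdb(copies, host_addrs):
    """Check the CellServDB copies used by the Ubik servers.

    copies:     {server_name: [addresses listed in that server's copy]}
    host_addrs: {host_name: set of all addresses that host owns}
    Returns a list of human-readable problems (empty if none were found).
    """
    if not copies:
        return []
    owner = {a: h for h, addrs in host_addrs.items() for a in addrs}
    problems = []
    # Rule 1: each copy lists each server exactly once.
    for name, listed in copies.items():
        seen = {}
        for addr in listed:
            host = owner.get(addr, addr)  # unknown address: treat as its own host
            if host in seen:
                problems.append(
                    f"{name}: {host} listed twice ({seen[host]}, {addr})")
            seen[host] = addr
    # Rule 2: every copy identifies the servers by the same addresses.
    reference_name, reference = next(iter(copies.items()))
    for name, listed in copies.items():
        if set(listed) != set(reference):
            problems.append(f"{name}: address set differs from {reference_name}")
    return problems


hosts = {"db1": {"192.0.2.43", "192.0.2.46"}, "db2": {"192.0.2.50"}}
# db2's copy names db1 by its other address, which is the case that breaks.
copies = {"db1": ["192.0.2.43", "192.0.2.50"],
          "db2": ["192.0.2.46", "192.0.2.50"]}
for problem in check_ubik_cellservdb(copies, hosts):
    print(problem)
```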

The way Ubik takes advantage of multi-homed servers is to dynamically
discover the additional addresses of each server.  Whenever a server
starts, it exchanges addresses with each other server, or at least the
ones that are actually up.  Once this is done, each of those servers is
able to contact the other using any of its addresses.  However, only one
address is used at a time -- Ubik doesn't start trying a new address for
a multi-homed peer until the one it's been using stops working.
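
As an illustration of "one address at a time", here is a small sketch of
the bookkeeping for a single peer.  The class and its method names are
invented for this example, and moving to the next address in round-robin
order is my assumption; all that is claimed above is that Ubik sticks with
the current address until it stops working.

```python
class PeerAddresses:
    """Track the known addresses of one peer and which one is in use."""

    def __init__(self, primary):
        # Start with the single address taken from CellServDB.
        self.addresses = [primary]
        self.current = 0

    def learn(self, addr):
        """Record an additional address learned in the address exchange."""
        if addr not in self.addresses:
            self.addresses.append(addr)

    def address(self):
        """The one address currently used to reach this peer."""
        return self.addresses[self.current]

    def report_failure(self):
        """Current address stopped working: move on to the next one."""
        self.current = (self.current + 1) % len(self.addresses)
        return self.address()


peer = PeerAddresses("192.0.2.43")
peer.learn("192.0.2.46")
print(peer.address())         # still the CellServDB address
print(peer.report_failure())  # only now is the second address tried
```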

Like other over-the-network communication in AFS, Ubik server-to-server
communication is done using Rx.  In particular, the voting protocol is
based on each "candidate" making an RPC to each other server; the vote
is encoded as the return value of that RPC.  What that means is that a
server has no opportunity to try sending its votes to multiple
addresses; it can only send one response, which necessarily goes to the
address that made the RPC.  So, if you have a network condition which
blocks traffic between two servers in only one direction, voting will
not work.  However, this normally will sort itself out, at least
partially, because the server making the Beacon RPC will see this as a
timeout and treat the other server as down.
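
A toy model of that call pattern, for illustration only: the `beacon`
callable stands in for the real RPC, and the names here are hypothetical.
The vote exists only as the return value, so a reply that never arrives
looks to the caller like a timeout and nothing more.

```python
def collect_votes(peers, beacon):
    """Call beacon(peer) for each peer; the return value is that peer's vote.

    A TimeoutError means no reply came back, so the peer is treated as down.
    Returns (votes, down).
    """
    votes, down = {}, set()
    for peer in peers:
        try:
            votes[peer] = beacon(peer)
        except TimeoutError:
            down.add(peer)
    return votes, down


def fake_beacon(peer):
    # Pretend replies from 192.0.2.50 are blocked somewhere on the way back.
    if peer == "192.0.2.50":
        raise TimeoutError(peer)
    return True  # this peer votes yes


votes, down = collect_votes(["192.0.2.43", "192.0.2.50"], fake_beacon)
print(votes, down)
```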

A worse situation arises when server A makes an RPC to server B, but the
best route from server B back to the original source address goes via a
different interface than the request came in on.  In this situation, the
kernel will assign the wrong source address to server B's outgoing
reply, which may cause Rx on server A to drop it on the floor.  This is
the problem that -rxbind is designed to work around, at the expense of
the server not really being multi-homed, at least as far as AFS is
concerned.  Whether this problem arises depends on your network
topology, but generally, you will have problems any time server B has
multiple interfaces whose best routes to A use the same outgoing
address.  This includes cases where one server has multiple addresses or
interfaces on the same subnet.
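
Here is a small simulation of that failure, under stated assumptions: the
kernel on server B is modeled as a longest-prefix route lookup that also
fixes the source address, and Rx on server A is modeled as dropping any
reply whose source differs from the address the request was sent to (in
reality it only "may" drop it).  The function names and the route-table
format are invented for the example.

```python
import ipaddress


def pick_source(routes, dest):
    """Return the local address used for traffic to dest.

    routes: list of (network, local_address); the longest matching prefix
    wins, and the first entry wins among equal prefix lengths.
    """
    best = None
    for net, local in routes:
        network = ipaddress.ip_network(net)
        if ipaddress.ip_address(dest) in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, local)
    if best is None:
        raise LookupError(f"no route to {dest}")
    return best[1]


def reply_accepted(request_dst, routes_on_b, a_addr):
    """True if B's reply carries the address A originally sent to."""
    reply_src = pick_source(routes_on_b, a_addr)
    return reply_src == request_dst


# B owns .43 and .46 on one subnet; its kernel sources all traffic from .46.
routes_on_b = [("192.0.2.0/24", "192.0.2.46")]
print(reply_accepted("192.0.2.46", routes_on_b, "192.0.2.10"))  # addresses match
print(reply_accepted("192.0.2.43", routes_on_b, "192.0.2.10"))  # mismatch
```

The second call is the two-addresses-on-one-subnet case: the request goes
to .43, but the reply leaves with .46 as its source.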

The sad truth is that in order to properly support multi-homed hosts, Rx
needs to be fixed so that it identifies all available interfaces, binds
a separate socket for each interface, and keeps track of which interface
an incoming connection belongs to, so that it can send responses out the
same interface.  This approach is necessarily used by all major
UDP-based services (e.g. DNS, NTP, DHCP), as it is the only way to
ensure correct behavior on a multi-homed host.
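
For anyone who has not seen the socket-per-address pattern, here is a
minimal sketch using plain UDP sockets in Python.  This is not Rx code and
not a proposed patch; the function names are made up, the caller supplies
the address list (interface enumeration is left out), and the "service" is
just an echo.

```python
import select
import socket


def bind_per_address(addresses, port=0):
    """Bind one UDP socket to each local address instead of the wildcard."""
    socks = []
    for addr in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((addr, port))
        socks.append(s)
    return socks


def serve_once(socks, timeout=1.0):
    """Echo one datagram per readable socket; return how many were handled.

    The reply is sent on the socket the request arrived on, so its source
    address is the address that socket is bound to.
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    handled = 0
    for s in readable:
        data, peer = s.recvfrom(2048)
        s.sendto(data, peer)
        handled += 1
    return handled
```

Because each socket is bound to one specific address, the reply's source
address no longer depends on which route the kernel picks for the return
path, which is exactly the property missing in the scenario above.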

-- Jeff