[OpenAFS] Re: Redundant Internet links

Thu, 16 Dec 2010 10:12:23 -0600

On Thu, 16 Dec 2010 01:55:31 +0100
Jaap Winius <jwinius@umrk.nl> wrote:

> Regarding the use of the redundant links in the above scenario, would
> it be possible -- desirable or wise even -- to present the clients,
> which have empty CellServDB files, with DNS AFSDB RRs that point to
> five different IP addresses: one internal for the local machine and
> four external for the two remote ones?

Is the "internal" address only routable from that site? Clients (OpenAFS
clients, anyway) ask the dbservers which dbserver is the sync site when
performing a write operation on any of the databases; I'm not sure how a
client behaves if the answer it gets back is an address it doesn't
recognize.

If you're just taking the "real" sites and adding additional IPs,
though, I think the clients should be fine. Aside from the
aforementioned sync site determination, it's effectively just a list of
sites to try to contact.

> As for the OpenAFS DB/file servers, I believe they cannot use DNS
> discovery in the same way, so would it be possible to configure their
> CellServDB files with two IP addresses per remote server?

For the purposes of talking to dbservers, fileservers are similar to
clients.

dbservers are different, though; the specification in CellServDB affects
the voting algorithm, and is not as simple as a list of sites to try to
reach. I don't think I'd recommend putting multiple IPs per site in
there, but I don't really know what happens if you do that. Each site is
aware of the local addresses of the other sites, and I'm not sure what a
site does if it sees that another site claims ownership of more than one
CellServDB server entry... But I don't imagine it ending well.
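
To make the worry concrete, here is a rough sketch of the situation in Python. It is not OpenAFS code: the function name, the site names, and the addresses (documentation ranges) are all made up for illustration. It just reports which sites would end up owning more than one server entry if you listed two IPs per remote site.

```python
from collections import defaultdict

def sites_with_multiple_entries(cellservdb_ips, local_addrs_by_site):
    """Return {site: [ips]} for sites that own more than one listed IP.

    Hypothetical helper for illustration only; OpenAFS has no such function.
    """
    owners = defaultdict(list)
    for ip in cellservdb_ips:
        for site, addrs in local_addrs_by_site.items():
            if ip in addrs:
                owners[site].append(ip)
    return {site: ips for site, ips in owners.items() if len(ips) > 1}

# Assumed layout: one internal address, two addresses per remote site.
cellservdb_ips = ["10.0.0.1", "192.0.2.1", "198.51.100.1",
                  "192.0.2.2", "198.51.100.2"]
local_addrs_by_site = {
    "local": {"10.0.0.1"},
    "remoteA": {"192.0.2.1", "198.51.100.1"},
    "remoteB": {"192.0.2.2", "198.51.100.2"},
}
dups = sites_with_multiple_entries(cellservdb_ips, local_addrs_by_site)
print(dups)
```

With that layout both remote sites show up as owning two entries each, which is exactly the case I don't know the behavior for.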

> If that's not an option, it will always be possible to fall back on a
> general routing solution, which is going to be configured anyway, but
> it would be cool if OpenAFS could, at all times, determine
> independently how best to make use of the available links.

Well, even if you did this AFS-side and got it to work, there isn't much
runtime automatic tuning of available sites or anything like that. The
benefit you would get is just that if one link to a site goes down,
clients will notice and fail over, possibly failing over to the same site.

As to whether this is advisable in general... it's not immediately clear
to me what failure case you're trying to cover here. If you have three
geographically-diverse sites, losing access to one of those db sites
will not significantly affect accessing AFS beyond an initial delay when
the site goes away. If you lose two sites, then you lose the ability to
make vldb or ptdb changes, but you can still access data (assuming the
relevant fileservers are still accessible).
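
The arithmetic behind that, as a deliberately simplified Python sketch: it assumes writes need a strict majority of db sites and ignores the tie-breaking rules that matter for even numbers of sites. The function name is mine, not anything in OpenAFS.

```python
def has_write_quorum(total_sites: int, reachable_sites: int) -> bool:
    """Simplified model: db writes need a strict majority of sites."""
    return 2 * reachable_sites > total_sites

TOTAL = 3  # three geographically-diverse db sites, as in this thread
for lost in range(TOTAL):
    up = TOTAL - lost
    state = "possible" if has_write_quorum(TOTAL, up) else "blocked"
    print(f"{lost} site(s) lost: vldb/ptdb changes {state}")
```

Reads are not modeled here; as noted above, data access keeps working as long as the relevant fileservers are reachable.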

So, redundant links to the various db sites may gain you some added
uptime in the event that both remote sites have exactly one link down. I
don't know much about your environment, but just from that, it doesn't
seem worth the effort of running with such a non-typical configuration.
While it may be possible to make it work, providing the redundancy at a
lower level seems easier.

What would seem to have more immediate benefit is to have your
fileservers register (at least) two IP addresses in the vldb; one for
each redundant link. That way, if one link to the site containing that
fileserver goes down, clients should still be able to access the data on
that fileserver. Of course, this should already happen automatically if
the two IPs for the fileserver are addresses configured on its local
interfaces (that is, they show up in 'ifconfig' or whatever).
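
Roughly what that buys you, sketched in Python. This is only an illustration of "try the registered addresses until one answers"; it is not how the client actually ranks or probes addresses, and the helper name and the addresses (documentation ranges) are invented.

```python
from typing import Callable, Iterable, Optional

def first_reachable(addresses: Iterable[str],
                    is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first address that answers, or None if none do."""
    for addr in addresses:
        if is_reachable(addr):
            return addr
    return None

# Assumed: one fileserver registered with one address per redundant link.
registered = ["192.0.2.10", "198.51.100.10"]
down_links = {"192.0.2.10"}  # pretend the first link has failed

chosen = first_reachable(registered, lambda a: a not in down_links)
print(chosen)
```

With only one registered address, the same link failure would leave nothing to fall back to.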

-- 
Andrew Deason
adeason@sinenomine.net