[OpenAFS] Re: Redundant Internet links
Thu, 16 Dec 2010 19:00:33 -0600
On Thu, 16 Dec 2010 23:39:55 +0100
Jaap Winius <email@example.com> wrote:
> I figure that it will be best if each OpenAFS server is only known by
> its public IP address(es), even to the clients on its internal
> network. This way, each OpenAFS client at each location in the cell
> will always contact the three OpenAFS servers using the same set of
> public IP addresses.
Yes, that's the only way to do it. We don't provide the tools for a
split-horizon vldb (yet, anyway).
> > If you're just taking the "real" sites and adding additional IPs,
> > though, I think the clients should be fine. Aside from the
> > aforementioned sync site determination, it's effectively just a list
> > of sites to try to contact.
> Okay, so I could give all of the clients a list of six IP addresses
> for the three servers. I plan to use DNS with AFSDB RRs for this,
> although that doesn't provide a method for giving priority to the
> faster links. However, I recently noticed RFC5864 (good work, Russ!):
> what's the minimum OpenAFS version necessary for anyone wanting to use
> SRV RRs for OpenAFS instead?
Well, you can always set the preferences for the servers on each client
by way of 'fs setserverprefs -vlservers'.
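For example (hypothetical hostnames; a lower rank means a higher
preference, so the vlservers behind the fast links get tried first):

```
fs setserverprefs -vlservers afsdb1.example.com 20000 \
                             afsdb2.example.com 20000 \
                             afsdb3.example.com 40000
# check what the client ended up with
fs getserverprefs -vlservers
```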
SRV support was first added in 1.5.66, I think, but there have been
a few issues; I believe all of the known ones were fixed as of 1.5.73.
It's not in 1.4; you probably want to wait for the 1.6 release for it.
And I don't think we take into account the SRV priority/weight when
determining db sites for the kernel, but I could very well be wrong there.
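For reference, an RFC 5864 record set for the vlservers might look like
this in a zone file (hypothetical cell and hostnames; lower priority
values are preferred):

```
_afs3-vlserver._udp.example.com. 3600 IN SRV 10 1 7003 afsdb1.example.com.
_afs3-vlserver._udp.example.com. 3600 IN SRV 10 1 7003 afsdb2.example.com.
_afs3-vlserver._udp.example.com. 3600 IN SRV 20 1 7003 afsdb3.example.com.
```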
> The redundant links will significantly reduce the risk for each site
> that, if it temporarily loses its connection to the Internet, the
> local OpenAFS file server also becomes read-only until the link is
> restored.
To be clear, the fileserver does not become readonly; what becomes
readonly are the databases that contain volume location information and
authenticated user metadata. That means you can read and write files
on any fileserver you can reach, but you cannot create, remove, or
release volumes; create, remove, or alter users/groups; or do anything
else that requires modifying those databases.
> > What would seem to have more immediate benefit is to have your
> > fileservers register (at least) two IP addresses in the vldb; one
> > for each redundant link. That way, if one link to the site
> > containing that fileserver goes down, clients should still be able
> > to access the data on that fileserver.
> Would it then not be necessary to have the database and file servers
> installed on separate hosts? Otherwise they would share the same
> CellServDB files (even with a single physical machine, that's
> possible using virtual hosts, although I would not be able to afford
> two public IPv4 addresses for both hosts). Or is that not what you
> meant?
No. Consider geographic sites A, B, and C. You're at site A and you want
to get some files in a volume that's only on a fileserver at site B, and
one of the links A<->B breaks. You contact the vlserver at site A, and
it will tell you that the volume is on a fileserver at site B, and it
will also tell you all known IP addresses for the fileserver at site B.
If the fileserver at B only has one IP registered, and the link for that
IP is the one that went down, you can't contact the B fileserver. But if
the B fileserver has two IP addresses, you may try to contact one of
them, fail, and then retry on the other known IP address on the working
link and succeed. This is why you see the messages like
"multi-homed address; other same-host interfaces maybe up"
"all multi-homed ip addresses down for the server"
from the client when it loses contact with a server.
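The retry behaviour above can be sketched like this (illustrative
Python, not OpenAFS code; the addresses and link states are made up):

```python
# Illustrative sketch: a client that tries every address the vldb
# returned for a multi-homed fileserver until one answers.

def first_reachable(addresses, is_link_up):
    """Return the first address whose link is up, or None.

    Mirrors the client behaviour described above: on failure it
    retries the other registered addresses before giving up
    ("all multi-homed ip addresses down" is the None case).
    """
    for addr in addresses:
        if is_link_up(addr):
            return addr   # contact succeeded on this interface
    return None

# Fileserver at site B registered two addresses; the first link is down.
b_addrs = ["198.51.100.10", "203.0.113.10"]   # hypothetical public IPs
link_up = {"198.51.100.10": False, "203.0.113.10": True}
print(first_reachable(b_addrs, link_up.get))   # -> 203.0.113.10
```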
> > Of course, this should already happen automatically if
> > the two IPs for the fileserver are addresses known by its local
> > interfaces (that is, they show up in 'ifconfig' or whatever).
> I plan to use the "iproute" package in Debian.
Well, 'ip route' or whatever. I just mean that the address is locally
bound to the server, so it knows its own addresses; it's not getting the
extra addresses from a NAT or load balancer or something.
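If you need to control exactly which addresses the fileserver registers,
the NetInfo file is the usual knob. A sketch (the path depends on your
install, e.g. /usr/afs/local/NetInfo for a Transarc-style layout or
/var/lib/openafs/local/NetInfo on Debian):

```
# One IP address per line; at startup the fileserver registers these
# locally bound addresses in the vldb.
192.0.2.10
198.51.100.10
```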