[OpenAFS] Fail over to replica sites

Russ Allbery rra@stanford.edu
Thu, 08 Aug 2002 16:39:02 -0700


Are other people having trouble with OpenAFS's failover to replica sites
when one server goes down?  We had one of our main replication servers
(which also holds the read/write versions of many of the volumes) go down
today, and rather than failing over to another server (even after a
delay), quite a few systems just started reporting "connection timed
out" on any path located in our AFS cell.

This seems less than ideal, and sort of punches a hole in the reliability
that AFS replication is supposed to provide.  To have the client cache
report "connection timed out" on a replicated volume when it hasn't tried
all of the replicas strikes me as simply wrong....
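
To make the expectation concrete, here is a minimal sketch of the behavior
I would expect, not a description of what the OpenAFS client actually does
internally.  Everything in it is hypothetical: the site names, the `fetch`
callable, and the `ReplicaTimeout` exception are made up for illustration.
The point is only that a timeout should surface after every replica site
has been tried, not after the first one fails.

```python
class ReplicaTimeout(Exception):
    """Raised only after every replica site has been tried."""


def read_with_failover(sites, fetch):
    """Try each replica site in order; return the first successful result.

    `sites` and `fetch` are hypothetical: a list of server names and a
    callable that reads from one server, raising TimeoutError if it is down.
    """
    failed = []
    for site in sites:
        try:
            return fetch(site)
        except TimeoutError:
            failed.append(site)  # remember the dead site, move to the next
    raise ReplicaTimeout("all replica sites timed out: " + ", ".join(failed))


def demo_fetch(site):
    # Simulated outage: afs1 is down, every other site answers.
    if site == "afs1":
        raise TimeoutError(site)
    return "data from " + site


print(read_with_failover(["afs1", "afs2", "afs3"], demo_fetch))
```

With one site down, the read is served by the next site in the list; the
error is raised only when the whole list has been exhausted.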

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>