[OpenAFS] AFS outage, impact of "moving" root.cell.readonly, root.afs.readonly
Thu, 26 Apr 2007 11:30:31 -0600
Yesterday I removed one of several instances of root.cell.readonly and
one of several instances of root.afs.readonly, both from file server X.
Almost exactly two hours later a number of AFS clients could not access
/afs and/or /afs/<local cell>, and the number of affected clients
increased over the next thirty minutes or so.
My expectation was that the clients would adjust to the removal of one
instance of the ROs. They apparently did not.
I mounted <local cell> from a client in a different cell (one that most
likely did not have any volume location information from <local cell>)
and confirmed that all AFS volumes were online and available: I was able
to walk the tree mounted at <local cell>:root.afs.
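A check like the one above could also be scripted. The following is a
minimal sketch, not a record of what was actually run: it assumes the
OpenAFS `vos` binary is on the PATH, uses `vos examine` with its `-id`,
`-cell` and `-noauth` options, and uses a hypothetical placeholder cell
name (`example.org`) in place of <local cell>. The helper names are
invented for illustration.

```python
import shutil
import subprocess

# Hypothetical placeholder; substitute the real cell name.
CELL = "example.org"
# The two root volumes discussed in this message.
VOLUMES = ["root.afs", "root.cell"]


def vos_examine_cmd(volume, cell):
    """Build the argv for an unauthenticated `vos examine` of one volume."""
    return ["vos", "examine", "-id", volume, "-cell", cell, "-noauth"]


def check_volumes(volumes, cell):
    """Run `vos examine` on each volume.

    Returns a dict mapping volume name to the command's exit status,
    or to None when the `vos` binary is not installed on this host.
    """
    results = {}
    for vol in volumes:
        cmd = vos_examine_cmd(vol, cell)
        if shutil.which(cmd[0]) is None:
            results[vol] = None
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[vol] = proc.returncode
    return results


if __name__ == "__main__":
    for name, status in check_volumes(VOLUMES, CELL).items():
        print(name, "vos not found" if status is None else status)
```

A zero exit status only shows that the VLDB and a file server answered
for that volume; it says nothing about what an already-running client
has cached.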
To clear up the confusion on the client side, I restarted all file
servers (I sure like fast restart), and we returned to normal within five
Where did I go wrong?