[OpenAFS] unresponsive clients after lowest ip database server went down

Garance A Drosehn drosih@rpi.edu
Fri, 25 Sep 2015 13:11:17 -0400

On 27 Aug 2015, at 16:35, Jonathan Leung-Nilsson wrote:

> Hello,
> I have recovered from this situation already, but I was curious to hear if
> others have tested or experienced this issue as well:
> If the AFS database server with the lowest IP address goes down or is
> offline, but there are 2 other database servers available, are clients
> and the remaining servers supposed to be able to handle that situation
> gracefully? We had an incident where this happened (in our case, the
> database server was taken offline because the switch died), and then it
> appeared that AFS access (simply "ls /afs/<cellname>/") and vos commands
> were unresponsive.

> So I am mainly wondering if this is expected - if OpenAFS depends on having
> its lowest IP address server online all the time - or if it's likely that
> we have a configuration issue in our cell. I set up our cell about 5 years
> ago as a complete newbie to OpenAFS, and while I've gained a lot of
> insights and experience since, I still don't understand all the nuances.

I don't know how severe your situation was, but I discussed a similar
problem back in October.  Here are two of the helpful answers to my
question from that thread:



I don't know if this helps to explain your situation, but note that the
issue mentioned in my thread happens when *any* database server goes

