[OpenAFS] unresponsive clients after lowest ip database server went down

Garance A Drosehn drosih@rpi.edu
Fri, 25 Sep 2015 13:11:17 -0400


On 27 Aug 2015, at 16:35, Jonathan Leung-Nilsson wrote:

> Hello,
>
> I have recovered from this situation already, but I was curious to hear if
> others have tested or experienced this issue as well:
>
> If the AFS database server with the lowest IP address goes down or is
> offline, but there are 2 other database servers available, then are clients
> and the remaining servers supposed to be able to handle that situation
> gracefully? We had an incident where this happened (in our case, the
> database server was taken offline because the switch died), and then it
> appeared that AFS access (simply "ls /afs/<cellname>/") and vos commands
> were unresponsive.

> So I am mainly wondering if this is expected - if OpenAFS depends on having
> its lowest IP address server online all the time - or if it's likely that
> we have a configuration issue in our cell. I set up our cell about 5 years
> ago as a complete newbie to OpenAFS, and while I've gained a lot of
> insights and experience since, I still don't understand all the nuances.

I don't know how severe your situation was, but I asked about a similar
problem on this list back in October 2014.  Here are two of the helpful
answers to my question from that thread:

https://lists.openafs.org/pipermail/openafs-info/2014-October/041085.html

https://lists.openafs.org/pipermail/openafs-info/2014-October/041086.html

I don't know if this helps to explain your situation, but note that the
issue mentioned in my thread happens when *any* database server goes
offline, not only the one with the lowest IP address.

-- 
Garance Alistair Drosehn                =     drosih@rpi.edu
Senior Systems Programmer               or   gad@FreeBSD.org
Rensselaer Polytechnic Institute;             Troy, NY;  USA