[OpenAFS] back off and try again later at AFS client boot when servers inaccessible

Paul Blackburn mpb@est.ibm.com
Thu, 08 Aug 2002 10:12:28 +0100


Greetz,

I have noticed that (on AIX) if you try to start the AFS client
while the database servers are down (or inaccessible), it can cause problems.

A work-around that works for most situations is to add
a simple ping test to the run-command script that starts the AFS Cache Manager
at boot time (Linux example: 
http://www.angelfire.com/hi/plutonic/images/afs ).
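
To illustrate the kind of ping test meant here, a minimal sh sketch
follows. It is not taken from the linked script: the server names,
DB_SERVERS, PING_CMD and the function names are placeholders for
illustration, and "ping -c 1" is assumed to be accepted by the local
ping (check your platform's options).

```sh
#!/bin/sh
# Sketch only: host names and function names below are placeholders.
DB_SERVERS="${DB_SERVERS:-afsdb1.example.com afsdb2.example.com}"

# Succeeds if the given host answers a single ping.
# PING_CMD may be overridden if the local ping needs other options.
host_answers() {
    ${PING_CMD:-ping -c 1} "$1" >/dev/null 2>&1
}

# Succeeds as soon as any one of the database servers answers.
any_db_server_up() {
    for host in $DB_SERVERS; do
        if host_answers "$host"; then
            return 0
        fi
    done
    return 1
}
```

The start-up script would call any_db_server_up and skip starting the
Cache Manager when it fails. Note that an answered ping only shows that
the host is up, not that the database server processes are running.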

However, there is one situation where this approach does not work:
site power outage.

For site power outages, I manually check that AFS database servers
are running OK first and then start (or re-start) AFS client machines.

It seems to me that recovery from events like site power failures
could be more robust if AFS clients had a boot time run-command
script that used a "back off and try again later" approach so that
if the database servers are inaccessible then clients will be automagically
restarted at some later time.
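
A rough sh sketch of such a "back off and try again later" loop, under
stated assumptions: the probe and start commands are supplied by the
caller, and the function names, SLEEP_CMD, the 30 second first delay and
the 900 second cap are all illustrative choices, not OpenAFS interfaces
or recommended values.

```sh
#!/bin/sh
# Sketch only: nothing here is an OpenAFS interface.

# next_delay CURRENT MAX
# Print the next delay: double the current one, capped at MAX.
next_delay() {
    doubled=$(( $1 * 2 ))
    if [ "$doubled" -gt "$2" ]; then
        echo "$2"
    else
        echo "$doubled"
    fi
}

# start_when_reachable PROBE START [FIRST_DELAY] [MAX_DELAY]
# Run PROBE until it succeeds, sleeping longer after each failure,
# then run START.
start_when_reachable() {
    probe=$1
    start=$2
    delay=${3:-30}
    max_delay=${4:-900}
    until $probe; do
        echo "AFS database servers not reachable; retrying in ${delay}s" >&2
        ${SLEEP_CMD:-sleep} "$delay"
        delay=$(next_delay "$delay" "$max_delay")
    done
    $start
}
```

A boot script might run it in the background so that the rest of the
boot is not held up, for example:
start_when_reachable check_db_servers start_afs_client &
where check_db_servers and start_afs_client stand for your site's own
probe and client start commands.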

Has anyone else found a good way to deal with recovery from a site
power failure?
--
cheers
paul                         http://acm.org/~mpb

"It is practically impossible to teach good programming to students
 that have had a prior exposure to BASIC; as potential programmers
 they are mentally mutilated beyond hope of regeneration."

  --Edsger Dijkstra  1930-2002   
http://www.digidome.nl/dijkstra_quotations.htm