[OpenAFS] Re: OpenAFS maintenance

Andrew Deason adeason@sinenomine.net
Thu, 7 Apr 2011 15:27:44 -0500


On Thu, 7 Apr 2011 15:13:24 -0400
"Dvorkin, Asya" <dvorkias@umdnj.edu> wrote:

> At my current position, I was given an already running Centos server
> with a cell setup on it, which is up and running.  I know how to do
> basic maintenance (increase quotas, check permissions).  My main
> concern is how can I prepare for emergencies?  What can I do/learn in
> advance with a running/working openafs setup that would "train" me for
> when something will actually go wrong?

If you only have one server, you may want to practice constructing a new
server and cell entirely from backups. That should tell you how easily
you can recover from a disaster scenario, and whether your backups are
actually adequate. But that concept isn't really AFS-specific.
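
One way to check such a restore, as a rough sketch only: compare a
known-good copy of some data against what came back from the backups.
The function names and both directory paths below are made up for
illustration. It only compares file contents, so AFS-specific state such
as ACLs and volume metadata still needs to be checked separately.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_trees(reference, restored):
    """Return (missing, extra, changed) lists of relative paths, sorted."""
    ref = tree_digests(reference)
    res = tree_digests(restored)
    missing = sorted(set(ref) - set(res))   # in reference, not restored
    extra = sorted(set(res) - set(ref))     # restored, not in reference
    changed = sorted(p for p in set(ref) & set(res) if ref[p] != res[p])
    return missing, extra, changed
```

Calling compare_trees() on the reference copy and the restored copy
should give three empty lists if the restore brought back every file
intact.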

If you set up more fileservers or database servers (which tends to be a
good idea for other reasons anyway), you can move or create 'test'
volumes on them and experiment with known failure scenarios. For
example, see what happens when a fileserver or dbserver drops off the
network, when garbage is written to /vicep*, or when you pull the power
cord or 'kill -9' the AFS daemons. You could even set up a separate
"test" cell and realm, depending on how much effort and resources you
want to spend on it. Some sites also find such a setup useful for
testing new versions, new functionality, etc.
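
If it helps to write a drill down before running it, here is a minimal
sketch that only builds the command lines for one experiment and prints
them for review; it does not execute anything. The helper names, the
default volume name 'test.drill', and the example server name are
assumptions for illustration. The vos and bos subcommands used are the
standard ones, but check the options against the man pages for your
version.

```python
import shlex


def drill_commands(server, partition, volume="test.drill"):
    """Return argv lists for one failure drill; nothing is executed here."""
    check = [
        ["vos", "examine", "-id", volume],
        ["bos", "status", "-server", server],
    ]
    create = [["vos", "create", "-server", server,
               "-partition", partition, "-name", volume]]
    # Record the state before the failure, inject the failure by hand
    # (unplug the network, kill -9 a daemon, ...), then run the same
    # checks again and compare what you see.
    return create + check + check


def render(commands):
    """Format argv lists as shell-quoted lines for a written checklist."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


if __name__ == "__main__":
    # Hypothetical test fileserver and partition.
    print(render(drill_commands("fs2.example.com", "/vicepa")))
```

Once the list looks right, each entry can be handed to subprocess.run()
or simply typed in by hand on the test server.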

And, of course, you can "be prepared" by purchasing a commercial support
contract :) <http://www.openafs.org/support.html>.

-- 
Andrew Deason
adeason@sinenomine.net