[OpenAFS] Summary of recommended configuration options from the workshop

Robert Banz rob@nofocus.org
Mon, 26 May 2008 15:38:52 -0700


>>
>
> From the conference:
> Why Derrick doesn't use fastrestart
> 1) You have to have something parse logfiles and salvage it
> 2) If you're running an inode fileserver, every time you salvage you  
> crawl all of the inodes. You salvage 10 volumes, you're going  
> through 10*<number of inodes>

I agree with Derrick's analysis -- yes, in the event you really do
have to salvage, you could end up salvaging a lot.
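To put a rough number on point 2 above, here is a back-of-the-envelope sketch. It assumes, as the quoted note says, that every per-volume salvage on an inode fileserver walks all of the partition's inodes. The function name and the inode count are made up for illustration; this is not a measurement from any real server.

```python
def inode_visits(total_inodes: int, salvage_runs: int) -> int:
    """Inodes visited if each salvage run scans every inode on the partition."""
    if total_inodes < 0 or salvage_runs < 0:
        raise ValueError("counts must be non-negative")
    return total_inodes * salvage_runs


# Hypothetical partition size, chosen only to make the arithmetic concrete.
partition_inodes = 2_000_000

ten_volume_salvages = inode_visits(partition_inodes, 10)  # 10 * <number of inodes>
single_pass = inode_visits(partition_inodes, 1)           # one crawl of the partition

print(ten_volume_salvages, single_pass)
```

Under that assumption, the work grows linearly with the number of volumes that get flagged for salvage, which is the cost Derrick is pointing at.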

However, in my experience, salvaging has only been necessary after a
hard system crash -- basically, when data was written to the
filesystem out of order from what AFS thinks it should be, etc.
If you're unlucky enough to be running in an environment where your
storage is unstable, or your filesystem doesn't guarantee ordered
writes (or come close to it), you've got other problems. That said,
I'd say it's very reasonable to salvage after a hard crash -- perhaps
that's a job for an init script, or for the administrator who was
investigating the cause of the failure.
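As a minimal sketch of the "job for an init script" idea: one way such a script could decide whether the last stop was orderly is a clean-shutdown marker file. Everything here is an assumption for illustration (the marker file name, the helper names, and the marker approach itself); it is not something OpenAFS provides, and it deliberately stops short of invoking any salvage command.

```python
import os
import tempfile


def mark_clean_shutdown(marker_path: str) -> None:
    """Called from the orderly stop path: record that shutdown was clean."""
    with open(marker_path, "w") as fh:
        fh.write("clean\n")


def needs_salvage(marker_path: str) -> bool:
    """True when no clean-shutdown marker exists, i.e. the last stop was not orderly."""
    return not os.path.exists(marker_path)


def clear_marker(marker_path: str) -> None:
    """Called at startup after the check, so the next crash is detected too."""
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        pass


# Demo in a temporary directory; "fileserver.clean" is a hypothetical name.
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "fileserver.clean")
    after_crash = needs_salvage(marker)      # no marker was ever written
    mark_clean_shutdown(marker)
    after_clean_stop = needs_salvage(marker)  # marker present
    clear_marker(marker)
    print(after_crash, after_clean_stop)
```

Note that a marker like this cannot tell a hard system crash apart from a deliberate kill -9 of the fileserver, so whoever does the kill -9 would have to write the marker by hand to skip the salvage.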

In most of the situations I ran into that required a fileserver
restart with prejudice (kill -9) -- things like thread lockups -- I
never had to salvage, and fastrestart is a lifesaver when you have
fileservers with a good deal of data.  Customers don't enjoy 30+
minutes of outage.

-rob