[OpenAFS] Re: Reoccuring salvager errors/fixes

Andrew Deason adeason@sinenomine.net
Fri, 11 Feb 2011 13:33:27 -0600


On Fri, 11 Feb 2011 19:54:41 +0100
Matthias Gerstner <matthias.gerstner@esolutions.de> wrote:

> > It shouldn't. The salvager itself will take a volume offline if run
> > against a single volume. All 'bos salvage' does for single-volume
> > requests is run the salvager. How are you normally scheduling these
> > salvages?
> 
> It's just that one time I ran "salvager" for a single volume
> interactively and that somehow screwed things up. I think I've read
> something about the salvager regarding volume locking in some man page
> but can't seem to find that part right now.

Okay, but if you can repeat that happening (with a test volume or
something), it could be helpful in identifying any possible problems.

Running the salvager manually for a whole partition or whole server
would indeed be bad if the fileserver was running; don't do that! But
for a single volume that _shouldn't_ be the case (in fact, if the
fileserver is not running, the salvager will not do anything for a
single-vol salvage).
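
For reference, a single-volume salvage through bos would look roughly
like the sketch below. The server and volume names are placeholders I
made up, not values from your cell, and the helper only prints the
command so it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Placeholder values (assumptions, not taken from this thread).
SERVER="fs1.example.com"
PARTITION="/vicepa"
VOLUME="test.volume"

# Print the single-volume salvage command instead of executing it.
# -partition and -volume restrict the salvage to one volume;
# -showlog displays the salvager log when the run finishes.
salvage_cmd() {
    echo bos salvage -server "$1" -partition "$2" -volume "$3" -showlog
}

salvage_cmd "$SERVER" "$PARTITION" "$VOLUME"
```

Going through bos (rather than invoking the salvager binary by hand)
keeps the single-volume case on the same path your scheduled salvages
use.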

> Anyway, right now I run the following command via a cron job once a
> week:
> 
> "bos salvage -server "server" -showlog -all -orphans remove"
> 
> I've just added the "-orphans remove" thanks to your hint.

Okay, but keep in mind that does delete data forever. Usually people I
talk to prefer to attach the data for examination when they find that
there is orphaned data, but that's completely up to you.
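
If you would rather keep the orphans around for inspection, the same
weekly command with "attach" instead of "remove" would be something like
this sketch. "server" is just the placeholder from your quoted command,
and the helper prints the command rather than running it.

```shell
#!/bin/sh
# Print the weekly whole-server salvage with orphans attached to the
# volume instead of deleted. The server name is a placeholder.
weekly_salvage_cmd() {
    echo bos salvage -server "$1" -showlog -all -orphans attach
}

weekly_salvage_cmd "server"
```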

> There's currently only one partition and it is using ext3:
> 
> mount | grep vicep
> /dev/mapper/raid1-afs.vicepa on /vicepa type ext3 (rw,noexec,nosuid,nodev,noatime)

I don't suppose you could try fsck'ing the fs to see if anything odd
comes up at that level, could you?
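
As a rough sketch of what I mean, assuming e2fsck is what handles ext3
on your system: the snippet below looks up the device behind /vicepa
from the mount output and prints a read-only check command. The helper
name is made up for illustration. A check against a mounted fs can
report spurious problems, so ideally shut down the fileserver and
unmount the partition first.

```shell
#!/bin/sh
# Read `mount`-style lines on stdin ("DEV on MOUNTPOINT type FS (opts)")
# and print the device for the given mount point.
device_for() {
    awk -v mp="$1" '$2 == "on" && $3 == mp { print $1 }'
}

dev=$(mount | device_for /vicepa)

# -f forces a check even if the fs is marked clean;
# -n opens the fs read-only and answers "no" to every prompt.
if [ -n "$dev" ]; then
    echo "e2fsck -f -n $dev"
fi
```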

Those mount options seem fine, though if you just want _something_ to try, you
could see if mounting without those extra options changes anything. I
cannot think of any possible reason any of those options would affect
what the fileserver does, and I'm pretty sure many installations use the
same options, but if you want to be sure... Did you use any non-default
options when creating the fs?

Also, can you double-check that nothing is crashing or getting killed,
etc, when you don't think it is? Check BosLog and make sure you don't
see the fileserver exiting on signals, or processes restarting at times
you don't expect. While salvages should fix any problems from that
automatically, it's just another thing to check.
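
A quick way to scan for that, as a sketch: the log path below is an
assumption (Transarc-style installs keep it under /usr/afs/logs, while
distribution packages often put it elsewhere), and the search pattern
is a guess at useful keywords, not an exact match for bosserver's
message format.

```shell
#!/bin/sh
# Assumed location; override by setting BOSLOG in the environment.
BOSLOG="${BOSLOG:-/usr/afs/logs/BosLog}"

# Print numbered lines that mention signals, core dumps, or restarts.
suspicious_lines() {
    grep -inE 'signal|core dump|restart' "$1"
}

if [ -r "$BOSLOG" ]; then
    suspicious_lines "$BOSLOG" || echo "no matches in $BOSLOG"
fi
```

Comparing the timestamps on any hits against when your salvages found
problems would show whether the two line up.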

-- 
Andrew Deason
adeason@sinenomine.net