[OpenAFS] File server, bos salvage hang

Russ Allbery rra@stanford.edu
Fri, 05 Nov 2004 20:40:35 -0800


Miles Davis <miles@cs.stanford.edu> writes:

> On occasion, we have the classic gconf problem, where for reasons I
> don't know (but have heard have been fixed in 1.3.X) a user's
> gconf lock file .gconfd/lock/ior becomes corrupt and/or unusable,
> requiring a salvage of the volume. Normally, not a big deal, it happens
> only rarely. However, I've got a file server that I can no longer
> salvage volumes on; running bos salvage <server> <part> <vol> never
> finishes, and the file server is never quite the same again until a
> restart (killing the file server) or reboot. By "never quite the same" I
> mean things like 'vos listvol' fails, though the file server is still
> working for volumes other than the one being salvaged. I haven't seen
> this behaviour with any of our other file servers, ever.

Do all vos commands start failing on that file server?  If so, you may
have run into the same problem that we just ran into with the campus AFS
servers.  We needed to upgrade all of them with a patch to the vos server.
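
One way to check this is to run a few read-only commands against the server
with a time limit, so that a hang can be told apart from an outright
failure.  The following is only a rough sketch: the probe function is a
hypothetical helper, fs1.example.edu is a placeholder host name, and it
assumes the coreutils timeout utility (which exits with status 124 when the
limit is reached) is available.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a time limit and classify the
# outcome.  Assumes coreutils `timeout`, which exits 124 on expiry.
probe() {
    limit=$1; shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        0)   echo ok ;;       # command completed successfully
        124) echo hung ;;     # time limit reached before it returned
        *)   echo failed ;;   # command returned an error
    esac
}

# Example use (placeholder host name, 60-second limit):
#   probe 60 vos listvol -server fs1.example.edu
#   probe 60 vos partinfo -server fs1.example.edu
#   probe 60 bos status -server fs1.example.edu -long
```

If every vos command reports hung or failed while bos status still answers,
that would be consistent with the volume server being the part that is
stuck.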

You can get the patch at:

    /afs/ir.stanford.edu/dev/afs/openafs/PATCHES/openafs/vospatch

I believe it's a bug fix that is already in 1.3.x and will be in the next
stable release (or was already in a stable release newer than the one we
happened to be running).

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>