[OpenAFS] File server, bos salvage hang

Miles Davis miles@cs.stanford.edu
Fri, 5 Nov 2004 21:27:19 -0800


On Fri, Nov 05, 2004 at 08:40:35PM -0800, Russ Allbery wrote:
> Miles Davis <miles@cs.stanford.edu> writes:
> 
> > On occasion, we have the classic gconf problem, where for reasons I
> > don't know (but have heard have been fixed in 1.3.X) a user's
> > gconf lock file .gconfd/lock/ior becomes corrupt and/or unusable,
> > requiring a salvage of the volume. Normally, not a big deal, it happens
> > only rarely. However, I've got a file server that I can no longer
> > salvage volumes on; running bos salvage <server> <part> <vol> never
> > finishes, and the file server is never quite the same again until a
> > restart (killing the file server) or reboot. By "never quite the same" I
> > mean things like 'vos listvol' fail, though the file server is still
> > working for volumes other than the one being salvaged. I haven't seen
> > this behaviour with any of our other file servers, ever.
> 
> Do all vos commands start failing on that file server?  If so, you may
> have run into the same problem that we just ran into with the campus AFS
> servers.  We needed to upgrade all of them with a patch to the vos server.

Yup, all vos commands are useless after it starts. Any idea what triggers
it, or how to prevent it? Or is it just random luck that few have seen it
so far?

-- 
// Miles Davis - miles@cs.stanford.edu - http://www.cs.stanford.edu/~miles
// Computer Science Department - Computer Facilities
// Stanford University