[OpenAFS] File server, bos salvage hang

Miles Davis miles@cs.stanford.edu
Fri, 5 Nov 2004 20:25:47 -0800


On Fri, Nov 05, 2004 at 11:23:32PM -0500, Derrick J Brashear wrote:
> On Fri, 5 Nov 2004, Miles Davis wrote:
> 
> >
> >On occasion, we have the classic gconf problem where, for reasons I don't
> >know (but have heard have been fixed in 1.3.X), a user's gconf lock
> 
> i don't believe so

Dang. OK, my imagination. :)

> 
> >file .gconfd/lock/ior becomes corrupt and/or unusable, requiring a salvage
> >of the volume. Normally, not a big deal, it happens only rarely. However,
> >I've got a file server that I can no longer salvage volumes on; running
> >bos salvage <server> <part> <vol> never finishes, and the file server is
> >never quite the same again until a restart (killing the file server) or
> >reboot. By "never quite the same" I mean things like 'vos listvol' fail,
> >though the file server is still working for volumes other than the one being
> >salvaged. I haven't seen this behaviour with any of our other file
> >servers, ever.
> 
> does it start? does the volume actually go offline?

You mean the salvage? It seems to start -- SalvageLog says

@(#) OpenAFS 1.2.11 built  2004-01-14
11/05/2004 19:46:03 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager /vicepb 536873343)

but that's it. The volume is inaccessible, so it seems offline to me. I 
don't know how to tell the state outside of using vos, and that's hosed.

-- 
// Miles Davis - miles@cs.stanford.edu - http://www.cs.stanford.edu/~miles
// Computer Science Department - Computer Facilities
// Stanford University