[OpenAFS] Re: 1.6.0pre2 - more vos issues, possible bug

Andrew Deason adeason@sinenomine.net
Fri, 4 Mar 2011 15:59:08 -0600

On Fri, 4 Mar 2011 16:23:34 -0500 (EST)
Andy Cobaugh <phalenor@gmail.com> wrote:

> Ok, an update to the problem I alluded to this morning.
> Volume name in question is pub.m.rpmforge. The .backup volume in 
> particular. This volume was backup'd this morning at approx. 0005, with 
> this output from vos backup:
> Failed to end the transaction on the rw volume 536873153 
> ____: server not responding promptly
> Error in vos backup command. 
> ____: server not responding promptly
> That command returned in <5s. I then see this in VolserLog:

What about the command immediately preceding this? Anything odd about
it; time it took to execute, or any warnings/errors/etc?

> I'm not sure how related this is to the other issue I saw, where the
> backup clone was left in a much worse state.

I don't think it is; that error above isn't even really much of a
problem; we just failed to end the transaction, but the transaction
is idle by that point and will be ended automatically after 5 minutes
(as you see in the VolserLog).
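
To illustrate the idea of an idle transaction being cleaned up on its own,
here is a minimal sketch. It is not the volserver's actual code: the class
and method names (Transaction, TransactionTable, reap_idle) are hypothetical,
and the only figure taken from this thread is the 5-minute timeout.

```python
# Hypothetical sketch of idle-transaction cleanup; not the OpenAFS implementation.

IDLE_TIMEOUT_SECONDS = 5 * 60  # the 5-minute figure mentioned above


class Transaction:
    """A volume transaction that remembers when it was last used."""

    def __init__(self, tid, volume_id, now):
        self.tid = tid
        self.volume_id = volume_id
        self.last_active = now

    def touch(self, now):
        # Any operation on the transaction resets its idle clock.
        self.last_active = now


class TransactionTable:
    """Tracks open transactions and ends those that have gone idle."""

    def __init__(self, timeout=IDLE_TIMEOUT_SECONDS):
        self.timeout = timeout
        self.active = {}

    def begin(self, tid, volume_id, now):
        self.active[tid] = Transaction(tid, volume_id, now)

    def end(self, tid):
        # Explicit end requested by the client; returns False if the
        # transaction is unknown (for example, already reaped).
        return self.active.pop(tid, None) is not None

    def reap_idle(self, now):
        # End every transaction idle for at least `timeout` seconds and
        # return the ids that were cleaned up.
        expired = [
            tid
            for tid, trans in self.active.items()
            if now - trans.last_active >= self.timeout
        ]
        for tid in expired:
            del self.active[tid]
        return expired
```

In this sketch, a client that fails to call end() leaves the transaction in
the table, and a periodic reap_idle() call removes it once it has been idle
for the timeout, which is why the failed "end" is mostly harmless.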

The first issue you reported had problems well before the log
messages you gave. Did anything happen to the backup volume before that?
No messages referencing that volume id? Did you or someone/thing else
remove the backup clone or anything?

The first messages around Tue Mar  1 00:02:12 2011 look like what would
happen if you tried to recreate the BK after it was deleted with that
code (fixed in the patches I mentioned before). The subsequent salvages
are from a failure to read some header data, which could be explained by
the attempted 'zap's and such, assuming those messages were during/after
you noticed the volume being inaccessible and tried forcefully deleting it.

Andrew Deason