[OpenAFS] Re: 1.6.0pre2 - more vos issues, possible bug
Fri, 4 Mar 2011 17:20:34 -0500 (EST)
On 2011-03-04 at 15:59, Andrew Deason ( firstname.lastname@example.org ) said:
> What about the command immediately preceding this? Anything odd about
> it; time it took to execute, or any warnings/errors/etc?
The commands before that all completed in 30 seconds or less. No messages
other than that.
>> I'm not sure how related this is to the other issue I saw, where the
>> backup clone was left in a much worse state.
> I don't think it is; that error above isn't even really much of a
> problem; we just failed to end the transaction, but the transaction
> is idle by that point and will be ended automatically after 5 minutes
> (as you see in the VolserLog).
> The first issue you reported had problems much earlier before the log
> messages you gave. Did anything happen to the backup volume before that?
> No messages referencing that volume id? Did you or someone/thing else
> remove the backup clone or anything?
Nope. We don't even access the backup volume when doing the file-level
> The first messages around Tue Mar 1 00:02:12 2011 look like what would
> happen if you tried to recreate the BK after it was deleted with that
> code (fixed in the patches I mentioned before). The subsequent salvages
> are from an error to read some header data, which could be explained by
> the attempted 'zap's and such, assuming those messages were during/after
> you noticed the volume being inaccessible and tried forcefully deleting
Yes, the zaps were me trying to get the .backup into a usable state.
Though, the first string of salvages started in the middle of the
afternoon without any intervention - I think the event that caused them
is what's missing from the picture.
I'm still a little hesitant to bos salvage that server - the whole reason
we're trying to switch to DAFS is to avoid the multi-hour fileserver
I'm going to take some time either later tonight or early next week to go
back through the logs and try to make more sense of them from a
chronological standpoint, and see if there's anything I missed.
There's still a bug somewhere that causes a .backup volume to go off-line
after being created. I have a test volume on one of the problem
fileservers right now, that's been vos backup'd once a minute since
yesterday without a problem. So, something else must have to happen to
cause this, just not sure what.
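
For anyone who wants to run the same kind of test, here is a rough sketch of a
once-a-minute loop. It is not the exact script in use here. The volume name
test.vol is made up, and the check assumes that "vos examine" prints "On-line"
in its header line for a usable volume; adjust both for your cell.

```sh
#!/bin/sh
# Sketch only: repeatedly recreate a backup clone and stop as soon as the
# .backup volume no longer looks usable.

# Recreate the backup clone of volume $1, then examine the clone.
# Returns 0 if the clone reports On-line, 1 otherwise.
check_backup() {
    vos backup "$1" || return 1
    out=$(vos examine "$1.backup" 2>&1) || return 1
    # Assumption: the vos examine header line contains "On-line"
    # when the volume is usable (and "Off-line" or an error otherwise).
    case "$out" in
        *On-line*) return 0 ;;
        *)         return 1 ;;
    esac
}

# run_loop VOLUME [INTERVAL_SECONDS] [MAX_RUNS]
# MAX_RUNS of 0 (the default) means loop until a failure is seen.
run_loop() {
    volume=$1
    interval=${2:-60}
    max=${3:-0}
    i=0
    while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        if ! check_backup "$volume"; then
            echo "$(date): $volume.backup not on-line after run $i" >&2
            return 1
        fi
        sleep "$interval"
    done
    return 0
}
```

Calling "run_loop test.vol 60" after loading these functions would repeat the
backup every minute and stop at the first failure, which gives a timestamp to
line up against the VolserLog and FileLog entries.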