[OpenAFS] Help needed for recovery of data of inode fileserver (Solaris 10 x86)

Jeffrey Altman jaltman@secure-endpoints.com
Fri, 04 Apr 2008 09:26:33 -0400


Hartmut Reuter wrote:
> Jeffrey Altman wrote:
>> Hartmut Reuter wrote:
>>
>>>> So what is the value of 'class' if not vLarge?
>>>>
>>> As you can see from that line above it's vSmall:
>>>
>>>  >>   [6] DistilVnodeEssence(rwVId = 536870912U, class = 1, ino =
>>>  >> 21977313U, maxu = 0x8046bc4), line 3175 in "vol-salvage.c"
>>>
>>> So there might really be something wrong with the SmallVnodeFile, 
>>> but to do an AssertionFailed is not the best way to repair it!
>>
>>
>> What the AssertionFailed means is that no one has written code to
>> deal with a case where this error has occurred.   It can't be
>> fixed with Salvager until someone writes the missing code.
> 
> Of course, but for the user it might be better to skip handling of this 
> error and to continue with the next vnode. So he could get back at least 
> the damaged volume and copy whatever is still accessible.
> 
> So John, ifdef line 3175 and recompile. If this was a single bad vnode 
> your volume may come online again, otherwise it's probably lost anyway.
> 
> Hartmut

I disagree.   The reason that assert is there is that continuing
will cause more damage to the data.  We do not know based upon
the available data whether this is a single bad vnode or whether
perhaps the wrong file is being referenced for the SmallVnodeFile.

What is known is that one vnode, perhaps the first vnode examined,
has completely valid data except for the fact that it is in the
wrong file.
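
To make the two possibilities concrete, here is a minimal read-only
sketch in C.  It is not taken from vol-salvage.c; every name in it
(the enums, expected_class, count_misfiled) is hypothetical.  It
assumes that directory vnodes belong in the large vnode index and
all other vnodes in the small one, and that each entry's type has
already been read into an array.  It counts how many entries are in
the wrong index without modifying anything.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the OpenAFS definitions. */
enum vnode_class { CLASS_LARGE = 0, CLASS_SMALL = 1 };
enum vnode_type { TYPE_NULL = 0, TYPE_FILE = 1, TYPE_DIRECTORY = 2, TYPE_SYMLINK = 3 };

/* Assumption: directories live in the large index, everything else in the small one. */
static enum vnode_class expected_class(enum vnode_type type)
{
    return type == TYPE_DIRECTORY ? CLASS_LARGE : CLASS_SMALL;
}

/*
 * Count the in-use entries whose type does not match the index they were
 * found in.  Unused (TYPE_NULL) slots are skipped.  Nothing is written.
 */
size_t count_misfiled(const enum vnode_type *types, size_t n,
                      enum vnode_class index_class)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (types[i] == TYPE_NULL)
            continue;
        if (expected_class(types[i]) != index_class)
            bad++;
    }
    return bad;
}
```

A count of one would point toward a single bad vnode.  A count equal
to the number of in-use entries would point toward the wrong file
being used as the index.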

There are several issues that are worth pursuing here, especially
because whatever the problem is has begun occurring on multiple machines:

1. what is the actual damage that has taken place?

2. can the damage be corrected?

3. can the damage be avoided in the first place?  What is the cause?

Jeffrey Altman