[OpenAFS] Help needed for recovery of data of inode fileserver
(Solaris 10 x86)
Hartmut Reuter
reuter@rzg.mpg.de
Fri, 04 Apr 2008 15:43:44 +0200
Jeffrey Altman wrote:
> Hartmut Reuter wrote:
>
>> Jeffrey Altman wrote:
>>
>>> Hartmut Reuter wrote:
>>>
>>>>> So what is the value of 'class' if not vLarge?
>>>>>
>>>> As you can see from that line above it's vSmall:
>>>>
>>>> >> [6] DistilVnodeEssence(rwVId = 536870912U, class = 1, ino =
>>>> >> 21977313U, maxu = 0x8046bc4), line 3175 in "vol-salvage.c"
>>>>
>>>> So there might really be something wrong with the SmallVnodeFile,
>>>> but raising an AssertionFailed is not the best way to repair it!
>>>
>>>
>>>
>>> What the AssertionFailed means is that no one has written code to
>>> deal with a case where this error has occurred. It can't be
>>> fixed with Salvager until someone writes the missing code.
>>
>>
>> Of course, but for the user it might be better to skip handling of
>> this error and to continue with the next vnode. So he could get back
>> at least the damaged volume and copy whatever is still accessible.
>>
>> So John, ifdef line 3175 and recompile. If this was a single bad vnode
>> your volume may come online again, otherwise it's probably lost anyway.
>>
>> Hartmut
>
>
> I disagree. The reason that assert is there is that continuing
> will cause more damage to the data. We do not know based upon
> the available data whether this is a single bad vnode or whether
> perhaps the wrong file is being referenced for the SmallVnodeFile.
>
> What is known is that one vnode, perhaps the first vnode examined,
> has completely valid data except for the fact that it is in the
> wrong file.
>
> There are several issues that are worth pursuing here. Especially
> because whatever the problem is has begun occurring on multiple machines:
>
> 1. what is the actual damage that has taken place?
>
> 2. can the damage be corrected?
>
> 3. can the damage be avoided in the first place? What is the cause?
>
> Jeffrey Altman
Of course we should not remove the assert() forever, but just for the
test of this volume which otherwise probably will be lost anyway.
In MR-AFS we had a -nowrite option to do just a dry-run. I admit that
it's a lot of work to implement this, but sometimes it is very helpful.
Hartmut
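
To make the trade-off discussed above concrete, here is a minimal sketch of a scan that reports a vnode of the wrong class and continues with the next one instead of asserting, with a dry-run switch in the spirit of a -nowrite option. Everything in it is an assumption for illustration: `vnode_rec`, `scan_result`, `scan_index` and the `nowrite` parameter are hypothetical names, the record is not the OpenAFS on-disk layout, and this is not code from vol-salvage.c or MR-AFS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified vnode record; NOT the OpenAFS on-disk layout. */
enum vnode_class { V_LARGE = 0, V_SMALL = 1 };

struct vnode_rec {
    unsigned id;
    enum vnode_class vclass;  /* class recorded in the entry itself */
    int cleared;              /* set when a repair pass discards the entry */
};

struct scan_result {
    size_t ok;   /* entries whose class matches the index */
    size_t bad;  /* entries found in the wrong index */
};

/*
 * Walk one vnode index and compare each entry's class with the class
 * the index is supposed to hold. A mismatch is reported and counted,
 * then the scan moves on to the next entry instead of aborting.
 * With nowrite != 0 nothing is modified (dry run); otherwise the
 * mismatching entry is marked as cleared.
 */
struct scan_result scan_index(struct vnode_rec *recs, size_t n,
                              enum vnode_class expected, int nowrite)
{
    struct scan_result r = {0, 0};
    size_t i;

    /* Only a caller bug is treated as fatal, not damaged data. */
    assert(recs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (recs[i].vclass == expected) {
            r.ok++;
            continue;
        }
        r.bad++;
        fprintf(stderr, "vnode %u: class %d, expected %d%s\n",
                recs[i].id, (int)recs[i].vclass, (int)expected,
                nowrite ? " (dry run, left untouched)" : " (cleared)");
        if (!nowrite)
            recs[i].cleared = 1;
    }
    return r;
}
```

Running the dry-run pass first shows whether only a single entry is affected or the whole index looks wrong, which is the information needed before deciding whether a writing pass is safe at all.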
--
-----------------------------------------------------------------
Hartmut Reuter e-mail reuter@rzg.mpg.de
phone +49-89-3299-1328
fax +49-89-3299-1301
RZG (Rechenzentrum Garching) web http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------