[OpenAFS] recover data from corrupted volume

Stephan Wiesand stephan.wiesand@desy.de
Wed, 16 Jan 2013 19:11:39 +0100


Hi Dimitris,

On Jan 16, 2013, at 18:25 , Dimitris Z wrote:

> Hi,
>
> I have been asked to attempt a recovery on an old, soon-to-be-decommissioned
> OpenAFS server. It is running openafs-server-1.4.7-68.2.SL4.x86_64 on

That was a fine release in its time. For the wizards: on top of the
1.4.7 release, it had these patches from CVS applied:

linux-hlist-unhashed-opencoding-20080520
butc-xbsa-lwp-protoize-damage-20080501
uuid-corrected-duplicate-check-20080501
viced-large-more-threads-20080506
vos-sync-flag-voltype-properly-20080521

+ backports of the patches to address OPENAFS-SA-2009-001 and -002
(affecting the client only). Thus, not too exotic.

> Scientific Linux 4.4.
>
> The RAID10 array (3ware Inc 9550SX) where the vicepX partitions lie went
> completely missing while replacing a faulty drive (ECC error). We put
> the failed drive back in so that we could access the array
> again and rsynced all the vicepX data elsewhere. This was completed
> successfully. Errors logged meanwhile in the kernel logs mentioned
> sectors being repaired.
>
> The array was then deleted and remade with new disks and the data was
> rsynced back. However, some volumes have not come online. Salvaging
> them causes them to be deleted. These volumes had been offline before
> the disk replacement attempt, so it is likely they have been corrupted
> for some time.
>
> bash-3.2$ bos salvage -server afs1 -partition vicepe -volume uk -showlog
> Starting salvage.
> bos: waiting for salvage to complete.
> bos: salvage completed
> SalvageLog:
> @(#) OpenAFS 1.4.7 built 2009-04-07 (68.2.SL4@fnal.gov)
> 01/16/2013 17:14:45 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager
> /vicepe 536874966)
> 01/16/2013 17:14:45 2 nVolumesInInodeFile 64
> 01/16/2013 17:14:45 CHECKING CLONED VOLUME 0.
> 01/16/2013 17:14:45 uk (536874966) updated 01/09/2013 17:13
> 01/16/2013 17:14:45 No header file for volume 0
> 01/16/2013 17:14:45 totalInodes 6446
> -bash-3.2$ bos salvage -server afs1 -partition vicepe -volume uk -showlog
> Starting salvage.
> bos: salvage completed
> SalvageLog:
> @(#) OpenAFS 1.4.7 built 2009-04-07 (68.2.SL4@fnal.gov)
> 01/16/2013 17:15:22 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager
> /vicepe 536874966)
> 01/16/2013 17:15:22 The volume header file V0536874966.vol is not
> associated with any actual data (deleted)
> 01/16/2013 17:15:22 No applicable vice inodes on vicepe; not salvaged
> 01/16/2013 17:15:22 0 nVolumesInInodeFile 0
> Temporary file /vicepe/salvage.inodes.vicepe.16945 is missing...
>
> /usr/sbin/vos examine uk
> Could not fetch the information about volume 536874966 from the server
> : No such device
> Volume does not exist on server afs1 as indicated by the VLDB
>
> Dump only information from VLDB
>
> uk
>    RWrite: 536874966
>    number of sites -> 1
>       server afs1 partition /vicepe RW Site
>
> My question is, are there any other options to at least partially
> recover data from the offline volumes? I do have my last rsync of
> vicepe where this volume was. It was probably damaged before it was
> backed up but even a partial data recovery would help.

If there are remnants left, you should find them in AFSIDat/K1/Kz++U.
If you can't find that directory, look for the files named zzzz5Mx1++0,
zzzz9Mx1++0, zzzzDMx1++0, zzzzPMx1++0, possibly in lost+found. If you
can't find any of those, I'm out of ideas/hope.
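
For reference, those names follow from the volume ID 536874966. Here is a
minimal Python sketch of the derivation, assuming the namei naming scheme
as I recall it from src/vol/namei_ops.c: names are a "flipped" base64
encoding (least significant six bits first), the first directory level
encodes the low byte of the volume ID, and the special files pack the
volume ID, a type tag and an all-ones vnode number into one 64-bit number.
The function names, the type table and find_remnants() are made up for
this illustration and are not part of OpenAFS.

```python
import os

# Alphabet of the namei "flipped base64" encoding (assumed from memory of
# src/vol/namei_ops.c; the names quoted above serve as a cross-check).
ALPHABET = "+=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def flipbase64(n):
    """Encode n six bits at a time, least significant digit first."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        chars.append(ALPHABET[n & 0x3F])
        n >>= 6
    return "".join(chars)


def volume_dir(volid):
    """Relative path of the volume's directory below /vicepX."""
    return "/".join(["AFSIDat", flipbase64(volid & 0xFF), flipbase64(volid)])


# Type tags of the special files (assumed values, not taken from the post).
SPECIAL_KINDS = {
    1: "volume info",
    2: "small vnode index",
    3: "large vnode index",
    6: "link table",
}


def special_file(volid, kind):
    """File name of a special file: volume ID in the upper 32 bits,
    a 3-bit type tag at bit 26, and a 26-bit all-ones vnode number."""
    ino = (volid << 32) | ((kind & 0x7) << 26) | 0x3FFFFFF
    return flipbase64(ino)


def find_remnants(root, volid):
    """Walk root and return paths whose last component matches the
    volume directory name or one of the special file names."""
    wanted = {special_file(volid, kind) for kind in SPECIAL_KINDS}
    wanted.add(flipbase64(volid))
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if name in wanted:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Run against the rsync copy of vicepe (and its lost+found), something like
find_remnants("/path/to/copy/of/vicepe", 536874966) would list any
surviving pieces; the path here is a placeholder.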

Best regards,
	Stephan


-- 
Stephan Wiesand
DESY -DV-
Platanenallee 6
15738 Zeuthen, Germany