[OpenAFS] Recovering vice partitions after a crash

Christopher Bayliss c.bayliss@nesc.gla.ac.uk
Mon, 16 Jun 2008 16:29:41 +0100


Hi,

We have been running a small AFS server as part of a research project
for a few months. Everything was fine until Solaris choked on a patch
last week and thoroughly killed itself. Having rebuilt the system, I'm
having trouble restarting the server with its old partitions.

We are running the 1.4.6 namei binaries on a Sun X4500 under Solaris 10.
There are five 100GB vice partitions on ZFS. Of the five, one was never
used, two were test/scratch space, one held the user volumes and the
remaining one held the main project file store as well as root.afs and
root.cell.

I can recover the user volumes with bos salvage on their partition.
However, the project volume and all the other volumes remain lost.

If I run salvage on the whole partition I get:

> 06/16/2008 15:24:39 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager -f /vicepa)
> 06/16/2008 15:24:39 SALVAGING FILE SYSTEM PARTITION /vicepa (device=vicepa)
> 06/16/2008 15:24:39 ***Forced salvage of all volumes on this partition***
> 06/16/2008 15:24:41 6 nVolumesInInodeFile 168 
> 06/16/2008 15:24:41 CHECKING CLONED VOLUME 32867307.
> 06/16/2008 15:24:41 root.afs (536870927) updated 01/22/2008 16:10
> 06/16/2008 15:24:41 No header file for volume 32867307
> 06/16/2008 15:24:41 totalInodes 6
> 06/16/2008 15:24:41 CHECKING CLONED VOLUME 32867307.
> 06/16/2008 15:24:41 root.cell (536870930) updated 06/03/2008 13:19
> 06/16/2008 15:24:41 No header file for volume 32867307
> 06/16/2008 15:24:41 totalInodes 79
> 06/16/2008 15:24:41 CHECKING CLONED VOLUME 32867307.
> 06/16/2008 15:24:41 nanocmos.root (536870939) updated 06/11/2008 10:20
> 06/16/2008 15:24:41 No header file for volume 32867307
> 06/16/2008 15:24:41 totalInodes 113211
> 06/16/2008 15:24:48 SALVAGING OF PARTITION /vicepa COMPLETED
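
For a longer SalvageLog, the pattern above (a volume name line followed
by "No header file") can be pulled out with a short script. This is only
a minimal sketch in Python: the regular expressions are assumptions
based on the few lines quoted here, not on the salvager's real format
strings, and the names summarize and header_file_name are made up for
illustration. The header file name is assumed to be "V" plus the volume
ID zero-padded to ten digits, which matches the V0536870939.vol path
reported in the FileLog.

```python
import re

# Sample lines copied from the salvager output quoted above.
SAMPLE_LOG = """\
06/16/2008 15:24:41 CHECKING CLONED VOLUME 32867307.
06/16/2008 15:24:41 root.afs (536870927) updated 01/22/2008 16:10
06/16/2008 15:24:41 No header file for volume 32867307
06/16/2008 15:24:41 totalInodes 6
06/16/2008 15:24:41 CHECKING CLONED VOLUME 32867307.
06/16/2008 15:24:41 root.cell (536870930) updated 06/03/2008 13:19
06/16/2008 15:24:41 No header file for volume 32867307
06/16/2008 15:24:41 totalInodes 79
06/16/2008 15:24:41 CHECKING CLONED VOLUME 32867307.
06/16/2008 15:24:41 nanocmos.root (536870939) updated 06/11/2008 10:20
06/16/2008 15:24:41 No header file for volume 32867307
06/16/2008 15:24:41 totalInodes 113211
06/16/2008 15:24:48 SALVAGING OF PARTITION /vicepa COMPLETED
"""

# Assumed line shapes: "<date> <time> <name> (<id>) updated ...", etc.
VOLUME_LINE = re.compile(r"^\S+ \S+ (?P<name>\S+) \((?P<id>\d+)\) updated ")
NO_HEADER_LINE = re.compile(r"^\S+ \S+ No header file for volume \d+$")
INODES_LINE = re.compile(r"^\S+ \S+ totalInodes (?P<count>\d+)$")


def header_file_name(volume_id):
    """Assumed naming: 'V' + ten-digit zero-padded volume ID + '.vol'."""
    return "V%010d.vol" % volume_id


def summarize(log_text):
    """Return one dict per volume name line found in the log text."""
    volumes = []
    current = None
    for raw in log_text.splitlines():
        # Drop any mail-style "> " quoting and trailing whitespace.
        line = raw.lstrip("> ").rstrip()
        m = VOLUME_LINE.match(line)
        if m:
            current = {
                "name": m.group("name"),
                "id": int(m.group("id")),
                "missing_header": False,
                "total_inodes": None,
            }
            volumes.append(current)
            continue
        if current is None:
            continue
        if NO_HEADER_LINE.match(line):
            current["missing_header"] = True
            continue
        m = INODES_LINE.match(line)
        if m:
            # totalInodes closes the record for the current volume.
            current["total_inodes"] = int(m.group("count"))
            current = None
    return volumes


if __name__ == "__main__":
    for vol in summarize(SAMPLE_LOG):
        print(vol["name"], header_file_name(vol["id"]),
              "missing_header=%s" % vol["missing_header"],
              "inodes=%s" % vol["total_inodes"])
```

Run against the output above, this lists root.afs, root.cell and
nanocmos.root, each flagged as missing its header file.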

The FileLog has:
> Mon Jun 16 15:24:49 2008 VAttachVolume: Error reading diskDataHandle vol header /vicepa/V0536870939.vol; error=101
> Mon Jun 16 15:24:49 2008 VAttachVolume: Error attaching volume /vicepa/V0536870939.vol; volume needs salvage; error=101

If I run bos salvage on the specific volume I get:
> 06/16/2008 15:49:18 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager /vicepa 536870939)
> 06/16/2008 15:49:18 The volume header file V0536870939.vol is not associated with any actual data (deleted)
> 06/16/2008 15:49:18 No applicable vice inodes on vicepa; not salvaged
> 06/16/2008 15:49:18 0 nVolumesInInodeFile 0 
> Temporary file /vicepa/salvage.inodes.vicepa.1724 is missing...

This deletes all the data. Fortunately I took a snapshot after the crash
but before I tried to recover anything, so I can roll back. All the
unrecoverable volumes behave essentially the same way when I try to
salvage them.

Repeated ZFS scrubs don't show any errors, so I don't think any of the
data was corrupted in the crash, unless a write was interrupted.

The old FileLog has entries like this at the end:
> 388: dev 2d50002, inode 131926, length 0, type/mode 800

Is this recoverable, or should we just start again? While we really
would like to recover this data, most of what was lost can be
recreated.

Many thanks.

	Chris

-- 
The University of Glasgow, charity number SC004401