[OpenAFS] help, salvaged volume won't come back online, is it corrupt? [trimmed log]

John Koyle jkoyle@koyle.org
Tue, 12 Sep 2006 22:22:35 -0600


Adam Megacz wrote:
> Well, this is it, the day I'd always feared...
>
> Server is running openafs-1.4.1.  I did a "vos remove" on
> root.cell.backup and suddenly couldn't access root.cell.  Shut down
> the fileserver, started it back up, and now it can't attach:
>
>   $ vos examine 536870912
>   **** Could not attach volume 536870912 ****
>
>       RWrite: 536870912     Backup: 536870914 
>       number of sites -> 1
>          server research.CS.Berkeley.EDU partition /vicepa RW Site 
>
>
> Everything for this cell is in one 3.3gb volume (root.cell); I hadn't
> yet gotten around to splitting it out into separate volumes.
>
> Help!  Is there any way to get 'bos salvage' to make the volume
> attachable?  Or in any other way recover the files or some subset
> thereof?
>
> The good news is that 'du -chs' of /vicepa gives exactly the right
> size (to within a few meg), so maybe I have reason to hope that my
> data is still there somewhere...
>
> SalvageLog (below) looks like it's stopping at /atmelx and getting
> upset.  I'm quite happy to nuke that directory since it hasn't changed
> since the last dump.  But other very, very important stuff has changed.
>
> Help!
>
>   - a
>
>
> ______________________________________________________________________________
> SalvageLog:
>
> @(#) OpenAFS 1.4.1 built  2006-06-18 
> 09/12/2006 20:27:15 STARTING AFS SALVAGER 2.4 (/usr/lib/openafs/salvager /vicepa 536870912)
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/1++++I=.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/2++++gV3.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/3++++kC4.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/5++++oI.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/7++++sI.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/9++++wI.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/B+++++J.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/D++++2J.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/F++++6J.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/0++++AJ.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/4++++EJ.
> ...
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/n5++6280.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/3+++6kC4.
> 09/12/2006 20:27:15 Found 0 link count file /vicepa/AFSIDat/+/++++U/+/+/=+++A2.
> 09/12/2006 20:27:16 1 nVolumesInInodeFile 28 
> 09/12/2006 20:27:16 SALVAGING VOLUME 536870912.
> 09/12/2006 20:27:16 root.cell (536870912) updated 09/12/2006 20:18
> 09/12/2006 20:27:16 totalInodes 9646
> 09/12/2006 20:27:16 iinc failed. inode 2632814952956 errno 9
> 09/12/2006 20:27:16 iinc failed. inode 9552007266813 errno 9
> ...
> 09/12/2006 20:27:17 iinc failed. inode 102744207679954 errno 9
> 09/12/2006 20:27:17 iinc failed. inode 102748502647252 errno 9
> 09/12/2006 20:27:17 iinc failed. inode 102752797614550 errno 9
> 09/12/2006 20:27:17 iinc failed. inode 102757092581848 errno 9
> 09/12/2006 20:27:17 iinc failed. inode 102761387549146 errno 9
> 09/12/2006 20:27:17 dir vnode 499: ??/atmelx (vnode 367): unique changed from 6884 to 0 -- deleted
>   
I had a problem moving some volumes (posted earlier today), and these 
are exactly the errors I'm getting in the salvage log for a user's home 
directory volume.  I've spent the last 6 hours trying to recover it, to 
no avail.  I would suggest a couple of things.  First, do a vos dump so 
you can at least get back to the state you're in now.  Second, don't 
keep running the salvager: it only seemed to leave my volume in a worse 
state, to the point where it's now saying:

vol-is-afs:/usr/afs/logs# tail -f SalvageLog
@(#) OpenAFS 1.4.1 built  2006-09-12
09/13/2006 02:53:46 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager /vicepa 536870984 -orphans attach)
09/13/2006 02:53:47 1 nVolumesInInodeFile 28
09/13/2006 02:53:47 SALVAGING VOLUME 536870984.
09/13/2006 02:53:47 home.user (536870984) updated 09/13/2006 01:08
09/13/2006 02:53:47 totalInodes 57696
09/13/2006 02:53:48 Cannot attach orphaned files and directories: Root directory not found
09/13/2006 02:53:48 Found 57692 orphaned files and directories (approx. 1770169 KB)
09/13/2006 02:53:48 Salvaged home.user (536870984): 57692 files, 1770169 blocks
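
For the "vos dump first" suggestion, a minimal sketch of what that step 
might look like follows.  The volume ID is the one from the log above; 
the output directory is a hypothetical path, and -localauth assumes the 
command is run as root on the file server itself.  Whether vos dump 
succeeds on a volume that will not attach is not something these logs 
confirm, so the script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: build a full "vos dump" command for a volume before any
# further salvage attempts. Nothing is executed unless RUN=1 is set.

# Volume ID copied from the SalvageLog above; DUMP_DIR is a hypothetical path.
VOL_ID="${VOL_ID:-536870984}"
DUMP_DIR="${DUMP_DIR:-/var/tmp/afs-dumps}"

# Print the vos dump command line for volume $1, written under directory $2.
# "-time 0" asks for a full dump; "-localauth" assumes this runs as root
# on the server machine itself.
dump_cmd() {
    printf 'vos dump -id %s -time 0 -file %s/%s.dump -localauth\n' "$1" "$2" "$1"
}

cmd=$(dump_cmd "$VOL_ID" "$DUMP_DIR")
echo "$cmd"

# Only run the command when explicitly requested.
if [ "${RUN:-0}" = "1" ]; then
    mkdir -p "$DUMP_DIR" && $cmd
fi
```

Keeping the dump file on a partition other than /vicepa means later 
salvage runs cannot touch it.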

Searching on the "Root directory not found" error turns up nothing 
promising.  I would also appreciate any help recovering volumes from 
either state.

John