[OpenAFS] Re: CopyOnWrite failure, leading to volume salvage

Dameon Wagner dameon.wagner@it.ox.ac.uk
Thu, 27 Sep 2012 17:00:36 +0100


On Thu, Sep 27 2012 at 10:07:33 -0500, Andrew Deason scribbled
 in "[OpenAFS] Re: CopyOnWrite failure, leading to volume salvage":
> On Thu, 27 Sep 2012 10:45:45 +0100
> Dameon Wagner <dameon.wagner@it.ox.ac.uk> wrote:

Hi Andrew,

> > In the end, `bos salvage` with "-volume 536874907" fixed everything,
> > with no known loss or corruption of data.  For the record, SalvageLog
> > contained many lines (just over a thousand) like the following first
> > three:
> > 
> > #---8<-----------------------------------------------------------------
> > Vnode 16928: version < inode version; fixed (old status)
> > Vnode 23352: version < inode version; fixed (old status)
> > Vnode 53568: version < inode version; fixed (old status)
> > ... ending with
> > totalInodes 1690748
> > Salvaged vhost.a071 (536874907): 939204 files, 361363522 blocks
> > #---8<-----------------------------------------------------------------
> 
> Is there anything in there besides the 'version < inode version'
> messages? (Those ones are (almost always) harmless.)

Thanks for your comments, especially those about the harmless log
entries -- that's always nice to hear.

There was nothing else in the SalvageLog except some (also
harmless-looking) preamble:

#---8<-----------------------------------------------------------------
STARTING AFS SALVAGER 2.4 (/usr/lib/openafs/salvager /vicepa 536874907)
5 nVolumesInInodeFile 160 
Recreating link table for volume 536874907.
CHECKING CLONED VOLUME 536892026.
CHECKING CLONED VOLUME 536891895.
CHECKING CLONED VOLUME 536891436.
CHECKING CLONED VOLUME 536891016.
SALVAGING VOLUME 536874907.
vhost.a071 (536874907) updated 08/26/2012 16:36
Vnode 16928: version < inode version; fixed (old status)
#---8<-----------------------------------------------------------------

The last line here is the first line of the snippet in my original
message above.
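In case it's useful to anyone hitting the same thing, here's a rough
shell sketch for tallying the (usually harmless) vnode fix-up lines.
The sample file below just replays lines from the log quoted above --
in practice you'd point grep at your real SalvageLog, whose path
varies by install (so treat any path as an assumption):

```shell
# Replay a few lines from the SalvageLog quoted above into a sample
# file, then count the "version < inode version" fix-ups with grep.
cat > /tmp/SalvageLog.sample <<'EOF'
Vnode 16928: version < inode version; fixed (old status)
Vnode 23352: version < inode version; fixed (old status)
Vnode 53568: version < inode version; fixed (old status)
Salvaged vhost.a071 (536874907): 939204 files, 361363522 blocks
EOF
# -c prints the number of matching lines rather than the lines themselves.
grep -c 'version < inode version; fixed' /tmp/SalvageLog.sample
# → 3
```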

> Can you 'vos examine' the relevant volume? If you don't want to make
> public the information like volume/server names, you can obfuscate
> stuff; I'd just want to see what the server/partition layout looks like
> for that volume's sites.

#---8<-----------------------------------------------------------------
$ vos examine -id 536874907
vhost.a071                        536874907 RW  372246072 K  On-line
    $FILESERVER /vicepa 
    RWrite  536874907 ROnly  536892662 Backup          0 
    MaxQuota  524288000 K 
    Creation    Tue Sep  9 16:08:54 2008
    Copy        Wed Sep  1 09:16:37 2010
    Backup      Never
    Last Update Thu Sep 27 16:31:50 2012
    187791 accesses in the past day (i.e., vnode references)

    RWrite: 536874907 
    number of sites -> 1
       server $FILESERVER partition /vicepa RW Site
#---8<-----------------------------------------------------------------

I was a little surprised to see "Backup Never", as our backup system's
logs show a successful backup just this morning (and on previous days
through the schedule too).  Let me know if any further information
would be helpful/useful.
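As an aside, here's a quick (unsupported, just awk over the text
output) way to pull the volume IDs out of that `vos examine` output;
the Backup ID of 0 on the header line at least looks consistent with
the "Backup Never" timestamp above:

```shell
# Replay the ID line from the `vos examine` output quoted above;
# in practice you'd pipe `vos examine -id 536874907` straight in.
cat > /tmp/vosexam.sample <<'EOF'
vhost.a071                        536874907 RW  372246072 K  On-line
    RWrite  536874907 ROnly  536892662 Backup          0 
EOF
# Match the line carrying all three IDs and print them labelled.
awk '/RWrite.*ROnly.*Backup/ {print "rw=" $2, "ro=" $4, "backup=" $6}' \
    /tmp/vosexam.sample
# → rw=536874907 ro=536892662 backup=0
```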

Cheers.

Dameon.

-- 
><> ><> ><> ><> ><> ><> ooOoo <>< <>< <>< <>< <>< <><
Dameon Wagner, Systems Development and Support Team
IT Services, University of Oxford
><> ><> ><> ><> ><> ><> ooOoo <>< <>< <>< <>< <>< <><