[OpenAFS] Salvageserver 1.6.1-3+deb7u1 core dump

chas williams - CONTRACTOR chas@cmf.nrl.navy.mil
Tue, 17 Jun 2014 11:15:26 -0400


On Tue, 17 Jun 2014 15:01:42 +0200 (CEST)
Harald Barth <haba@kth.se> wrote:

> 
> 
> Well, I did add a patch like:
> 
> 
> Index: openafs-1.6.9/src/vol/vol-salvage.c
> ===================================================================
> --- openafs-1.6.9.orig/src/vol/vol-salvage.c    2014-06-12 08:30:48.000000000 +0000
> +++ openafs-1.6.9/src/vol/vol-salvage.c 2014-06-17 10:34:23.857444175 +0000
> @@ -4124,7 +4124,8 @@
>                                         &salvinfo->VolumeChanged);
>                     pa.Vnode = LFVnode;
>                     pa.Unique = LFUnique;
> -                   osi_Assert(Delete(&dh, "..") == 0);
> +                   if(Delete(&dh, "..") != 0)
> +                     Log("Delete of .. failed, but will try to recreate it anyway\n");
>                     osi_Assert(Create(&dh, "..", &pa) == 0);
>  
>                     /* The original parent's link count was decremented above.
> 
> 
> Which created two empty __ORPHANDIR__* in the volume.

You did get two 'Delete of .. failed' messages in the salvage log, right?

> Then I have the following logs from my backup script which tried a vos backup home.katy

After your new salvager created the orphan directories?

> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Could not start a transaction on the volume 536904474
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Volume needs to be salvaged
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Error in vos backup command.
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Volume needs to be salvaged
> 
> However, my salvage log says that the volume was salvaged OK:
> 
> 06/16/2014 23:39:37 Salvaged home.katy (536904474): 23897 files, 732753 blocks
> 
> and that the salvage ended 06/16/2014 23:57:23 which is several hours before.

Salvage again using the 1.6.1 salvager.  There shouldn't be any broken
orphans anymore.
 
> When I did a vos backup home.katy recently, everything went fine. What's going on here?
> 
> Followup question: Should I now run a salvage over all volumes? How do
> I do that with as little impact as possible manually?

I wouldn't salvage anything unless you know there is something wrong
with the volumes.

To reduce impact, I would write a small script to salvage the volumes
one at a time.
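
Something along these lines might do as a starting point. It is only a
sketch: it assumes you feed it the output of "vos listvol" run with
-fast (one numeric volume ID per line), that "bos salvage" does not
return until the salvage of that volume has finished, and that you
already have tokens or -localauth sorted out. The function names, the
server name, and the partition are placeholders, not anything taken
from your setup.

```python
import subprocess


def parse_volume_ids(listvol_output):
    """Return the volume IDs from text with one numeric ID per line.

    Lines that are not purely numeric (summary or blank lines) are skipped.
    """
    return [line.strip() for line in listvol_output.splitlines()
            if line.strip().isdigit()]


def salvage_command(server, partition, volume):
    """Build the argument list for salvaging a single volume."""
    return ["bos", "salvage",
            "-server", server,
            "-partition", partition,
            "-volume", volume]


def salvage_one_at_a_time(server, partition, volumes, run=subprocess.run):
    """Salvage each volume in turn, stopping at the first failure.

    `run` is injectable so the loop can be dry-run without touching a
    real fileserver. Assumption: each call blocks until that salvage
    is done, so only one volume is offline at any moment.
    """
    for volume in volumes:
        run(salvage_command(server, partition, volume), check=True)
```

To use it, capture the listvol output for one partition, pass it through
parse_volume_ids(), and hand the result to salvage_one_at_a_time() with
your own server and partition names. Try it with a run function that just
prints the command before letting it loose on a production server.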