[OpenAFS] Salvageserver 1.6.1-3+deb7u1 core dump
chas williams - CONTRACTOR
chas@cmf.nrl.navy.mil
Tue, 17 Jun 2014 11:15:26 -0400
On Tue, 17 Jun 2014 15:01:42 +0200 (CEST)
Harald Barth <haba@kth.se> wrote:
>
>
> Well, I did add a patch like:
>
>
> Index: openafs-1.6.9/src/vol/vol-salvage.c
> ===================================================================
> --- openafs-1.6.9.orig/src/vol/vol-salvage.c 2014-06-12 08:30:48.000000000 +0000
> +++ openafs-1.6.9/src/vol/vol-salvage.c 2014-06-17 10:34:23.857444175 +0000
> @@ -4124,7 +4124,8 @@
>                            &salvinfo->VolumeChanged);
>             pa.Vnode = LFVnode;
>             pa.Unique = LFUnique;
> -           osi_Assert(Delete(&dh, "..") == 0);
> +           if (Delete(&dh, "..") != 0)
> +               Log("Delete of .. failed, but will try to recreate it anyway\n");
>             osi_Assert(Create(&dh, "..", &pa) == 0);
>
>             /* The original parent's link count was decremented above.
>
>
> Which created two empty __ORPHANDIR__* directories in the volume.
You did get two 'Delete of .. failed' messages, right?
> Then I have the following logs from my backup script, which tried a vos backup home.katy:
After your new salvager created the orphan directories?
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Could not start a transaction on the volume 536904474
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Volume needs to be salvaged
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Error in vos backup command.
> Tue Jun 17 01:39:36 2014 beef.stacken.kth.se : Volume needs to be salvaged
>
> However, my salvage log says that the volume was salvaged OK:
>
> 06/16/2014 23:39:37 Salvaged home.katy (536904474): 23897 files, 732753 blocks
>
> and that the salvage ended at 06/16/2014 23:57:23, which is several hours before the failed backup.
Salvage again using the 1.6.1 salvager. There shouldn't be any broken
orphans anymore.
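For a single volume that should be something like the following
(untested; adjust the partition name to yours, and -forceDAFS is
needed since you are running a demand-attach fileserver with a
salvageserver):

  bos salvage -server beef.stacken.kth.se -partition /vicepa -volume home.katy -forceDAFS -localauth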
> When I did a vos backup home.katy recently, everything went well. What's going on here?
>
> Followup question: Should I now run a salvage over all volumes? How do
> I do that manually with as little impact as possible?
I wouldn't salvage anything unless you know there is something wrong
with the volumes.
To reduce impact, I would write a small script to salvage the volumes
one at a time.
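Something like this untested sketch, say. The server and partition
names are just examples taken from your logs; it assumes the vos and
bos binaries are in PATH, that you run it on the fileserver itself so
-localauth works, and -forceDAFS again because of the demand-attach
fileserver:

#!/usr/bin/env python
# Salvage every volume on one partition, strictly one at a time, so
# that only a single volume is ever offline.
import subprocess

SERVER = "beef.stacken.kth.se"   # example fileserver from your logs
PARTITION = "a"                  # example vice partition (/vicepa)

# 'vos listvol -fast' prints just the numeric volume IDs, one per line.
out = subprocess.check_output(
    ["vos", "listvol", "-server", SERVER, "-partition", PARTITION,
     "-fast", "-localauth"]).decode()

for volid in out.split():
    if not volid.isdigit():      # skip anything that isn't a volume ID
        continue
    # One volume per bos salvage call; the volume comes back online
    # before the next one is taken down.
    subprocess.check_call(
        ["bos", "salvage", "-server", SERVER, "-partition", PARTITION,
         "-volume", volid, "-forceDAFS", "-localauth"])

That way at most one volume is offline at any given moment, and you
can stop it between volumes if the load gets too high.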