[OpenAFS-devel] Patch for the salvager

Hartmut Reuter reuter@rzg.mpg.de
Thu, 17 Jul 2008 11:12:54 +0200


The problem Rainer Toebbicke described some time ago, that directories 
had garbage in the high-order 32 bits of the file length, is now understood:

When the salvager runs with -salvagedirs, it copies all directories 
into new inodes and updates the vnode. To do that it used

VNDISK_SET_LEN(&vnode, Length(&newdir));

Length() returns only an int, not an afs_size_t. So at least on big-endian 
SUNs length_hi is filled with garbage.
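
The fix in the patch below (widen the value into a 64-bit variable first, 
then split it) can be sketched in isolation as follows. This is only an 
illustration: struct disk_len, SET_LEN(), dir_length() and 
store_length_safely() are hypothetical stand-ins, and the assumption that 
VNDISK_SET_LEN splits its argument with a 32-bit shift is not confirmed 
in this mail.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two on-disk length fields of a vnode. */
struct disk_len {
    uint32_t length_hi;
    uint32_t length_lo;
};

/* Hypothetical split macro. It is only well defined when `val` is already
 * 64 bits wide: with a 32-bit `val`, `val >> 32` shifts by the full width
 * of the type, which is undefined behavior in C. */
#define SET_LEN(d, val) do { \
    (d)->length_hi = (uint32_t)((val) >> 32); \
    (d)->length_lo = (uint32_t)((val) & 0xFFFFFFFFu); \
} while (0)

/* Stand-in for Length(): returns a plain int. */
static int dir_length(void)
{
    return 2048;
}

/* Widen the int into a 64-bit variable before handing it to the macro. */
static struct disk_len store_length_safely(void)
{
    struct disk_len d;
    uint64_t length = (uint64_t)dir_length();

    SET_LEN(&d, length);
    return d;
}
```

Calling store_length_safely() yields a high word of 0 and a low word 
equal to the directory length, independent of byte order.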

I also experienced another "nice" problem here on Linux: after a salvage 
with -salvagedirs, directories which had contained the famous .__afsXXX 
files were corrupted; they actually had length 0. The next salvage 
would then find many orphans...

The reason was that when these .__afsXXX files (or other oddities) 
appear, the salvager does a CopyOnWrite, copying the contents of the 
directory once again. On namei fileservers the new pseudo-inode could get 
the same tag as the one -salvagedirs had deleted just before. Unfortunately 
the deleted file still had an open file descriptor, so the new copy went 
into the unlinked file while the newly created file remained empty!
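
The underlying POSIX behavior can be reproduced without any AFS code. The 
following is a minimal sketch, assuming a POSIX system; stale_fd_demo() 
and the path passed to it are hypothetical. A descriptor opened before 
unlink() keeps referring to the removed file, so data written through it 
never reaches a new file created under the same name.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the size of the re-created file after writing through the
 * stale descriptor, or -1 on error. */
static long stale_fd_demo(const char *path)
{
    struct stat st;
    long result = -1;
    int stale, fresh;

    /* Open the "old directory file" and keep the descriptor. */
    stale = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (stale < 0)
        return -1;

    /* Remove the name; the file lives on while `stale` is open. */
    if (unlink(path) != 0) {
        close(stale);
        return -1;
    }

    /* Create a new, different file under the same name. */
    fresh = open(path, O_CREAT | O_RDWR | O_EXCL, 0600);
    if (fresh < 0) {
        close(stale);
        return -1;
    }

    /* The write lands in the unlinked file, not in the new one. */
    if (write(stale, "dirdata", 7) == 7 && stat(path, &st) == 0)
        result = (long)st.st_size;

    close(fresh);
    close(stale);
    unlink(path);
    return result;
}
```

On a POSIX filesystem the function returns 0: the new file stays empty, 
which matches the zero-length directories described above. Really closing 
the old descriptor before the name is reused avoids this.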

Additionally, I changed the define for FSYNC_SALVAGE from 1 (which 
means V_READONLY) to 4, which forces the volume offline in the 
fileserver, giving "busy" to the users during the salvage.

This patch was created for the rxosd environment, so line numbers may be 
slightly different for normal openafs-1.4.7, but I think it should be 
possible to find the right place...

Hartmut

--- vol-salvage.c       (revision 80)
+++ vol-salvage.c       (working copy)
@@ -1797,7 +1797,8 @@
                         || vsp->header.parent == singleVolumeNumber)) {
                     (void)afs_snprintf(nameShouldBe, sizeof nameShouldBe,
                                        VFORMAT, vsp->header.id);
-                   if (singleVolumeNumber)
+                   if (singleVolumeNumber
+                     && vsp->header.id != singleVolumeNumber)
                         AskOffline(vsp->header.id);
                     if (strcmp(nameShouldBe, dp->d_name)) {
                         if (!Showmode)
@@ -2874,10 +2875,12 @@
      struct VnodeClassInfo *vcp = &VnodeClassInfo[vLarge];
      Inode oldinode, newinode;
      DirHandle newdir;
+    FdHandle_t *fdP;
      afs_int32 code;
      afs_sfsize_t lcode;
      afs_int32 parentUnique = 1;
      struct VnodeEssence *vnodeEssence;
+    afs_size_t length;

      if (Testing)
         return;
@@ -2912,7 +2915,7 @@
                    vnode.uniquifier,
                    (vnode.parent ? vnode.parent : dir->vnodeNumber),
                    parentUnique);
-    if (code == 0)
+    if (code == 0)
         code = DFlush();
      if (code) {
         /* didn't really build the new directory properly, let's just give up. */
@@ -2933,7 +2936,8 @@
      }
      vnode.cloned = 0;
      VNDISK_SET_INO(&vnode, newinode);
-    VNDISK_SET_LEN(&vnode, Length(&newdir));
+    length = Length(&newdir);
+    VNDISK_SET_LEN(&vnode, length);
      lcode =
         IH_IWRITE(vnodeInfo[vLarge].handle,
                   vnodeIndexOffset(vcp, dir->vnodeNumber), (char *)&vnode,
@@ -2950,8 +2954,13 @@
  #else
      vnodeInfo[vLarge].handle->ih_synced = 1;
  #endif
+    /* make sure old directory file is really closed */
+    fdP = IH_OPEN(dir->dirHandle.dirh_handle);
+    FDH_REALLYCLOSE(fdP);
+
      code = IH_DEC(dir->ds_linkH, oldinode, dir->rwVid);
      assert(code == 0);
+
      dir->dirHandle = newdir;
  }

--- fssync.h    (revision 80)
+++ fssync.h    (working copy)
@@ -33,9 +33,10 @@
  /* Reasons (these could be communicated to venus or converted to messages) */

  #define FSYNC_WHATEVER         0       /* XXXX */
-#define FSYNC_SALVAGE          1       /* volume is being salvaged */
+#define FSYNC_READONLY         1       /* same as V_READONLY  */
  #define FSYNC_MOVE             2       /* volume is being moved */
  #define FSYNC_OPERATOR         3       /* operator forced volume offline */
+#define FSYNC_SALVAGE          4       /* volume is being salvaged */


  /* Replies (1 byte) */
-----------------------------------------------------------------
Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
			   	phone 		 +49-89-3299-1328
			   	fax   		 +49-89-3299-1301
RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------