[OpenAFS] salvage removed .6M files!

rader@ginseng.hep.wisc.edu rader@ginseng.hep.wisc.edu
Fri, 29 Jul 2005 16:44:10 -0500


More information, fwiw... 

 - SalvageLog.old indicates (the initial) salvaging started 
   at 01:07:43 

 - BosLog indicates that that salvage exited with signal 15 at 
   05:00:38

 - SalvageLog indicates another salvage--the one that went 
   awry--started at 05:00:38 and completed 06:44:41

 - bos getrestart reports the server should restart for
   new binaries at "5:00 am"

Is it possible that the "restart for new binaries" kicked in
erroneously and sent SIGTERM to the running bos salvage, leaving the
volume in an inconsistent state that caused the subsequent salvage to
blow chunks?  (I'm under the general impression that interrupting
salvages is a bad idea.)

At any rate, I've turned off the "restarts for new binaries at
5:00 am" thing.
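
In case it helps anyone sizing up similar damage, here is a rough
Python sketch for pulling the list of deleted paths out of a
SalvageLog.  It assumes each "-- deleted" entry sits on a single line
in the log (the backslash wraps in the excerpt quoted from my original
message are only for mail formatting), and the names DELETED and
deleted_entries are made up for this example, not part of OpenAFS.

```python
import re

# Pattern modelled on the "-- deleted" SalvageLog entry quoted in this thread.
# Assumption: the real log keeps each entry on one line (no backslash wrap).
DELETED = re.compile(
    r"dir vnode (?P<dir>\d+): (?P<path>\S.*?) \(vnode (?P<vnode>\d+)\): "
    r"unique changed from (?P<old>\d+) to (?P<new>\d+) -- deleted"
)


def deleted_entries(lines):
    """Yield (dir_vnode, path, vnode) for each entry reported as deleted."""
    for line in lines:
        match = DELETED.search(line)
        if match:
            yield (
                int(match.group("dir")),
                match.group("path"),
                int(match.group("vnode")),
            )


# Example use (the log path is the one from the original message):
#   with open("/usr/afs/logs/SalvageLog") as log:
#       for dir_vnode, path, vnode in deleted_entries(log):
#           print(path)
```

This only lists what the salvager says it dropped; it does not recover
anything, but the resulting path list can be compared against what
comes back from tape.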

steve 
- - - 
systems & network manager
high energy physics
university of wisconsin

 > ---- Original Message ----
 > From: rader
 > 
 > One of our servers (Solaris7 inode fileserver running 1.2.11) lost
 > power this morning and the resulting bos salvage on a large (50 GB)
 > volume removed about 600,000 files....  /usr/afs/logs/SalvageLog
 > reads, for example...
 > 
 >  07/29/2005 06:19:26 dir vnode 87953: invalid entry: \
 >    ./cmsprod/cern/setup.sh (vnode 2258102, unique 14499243)
 >  07/29/2005 06:19:26 dir vnode 87953: ./cmsprod/cern/setup.sh \
 >    (vnode 2258102): unique changed from 14499243 to 0 -- deleted
 > 
 > Does anybody have any suggestions about how to recover the lost
 > files??  (I'm restoring from tape now, but I'll still have the
 > busted volume around when I'm done.)
 > 
 > steve 
 > - - - 
 > systems & network manager
 > high energy physics
 > university of wisconsin