[OpenAFS] Problems with AFS native backups

Mike Polek mike@pictage.com
Fri, 26 Aug 2005 15:34:09 -0700


Hi, all,
   I need some help regarding the AFS native backup system.
I was having a problem the other day where my backups took
more tape than I thought they would, and there wasn't a
next tape the library. So, I added a couple tapes and
figured it would pick up where it left off.
   I have a program that figures out which tape to load
by looking at the backup database. So it loaded the last
tape into the drive (which of course was full), but instead
of realizing it was full and requesting the next tape, it
said "Can't position to end of dump on tape ST000012".

backup command looks like:

/usr/sbin/backup dump -volumeset storage -dump /month1/week3/Mon/Tue/Wed/Thu 
-append -portoffset 0 -cell p

---- excerpt from tape log file ----

Thu Aug 25 04:04:46 2005: Task 1: Prompt for tape storage.Thu.1 (1124967886)
Thu Aug 25 04:05:43 2005: Task 1: Proceeding with tape operation
Thu Aug 25 04:09:23 2005: Task 1: Can't append: Can't position to end of 
dump on tape ST000012
      butm: unexpected tape datablock
Thu Aug 25 04:09:23 2005: Task 1: Warning: Can't close tape
      butm: error during tape close: Invalid argument
Thu Aug 25 04:10:23 2005: Task 1: Prompt for tape storage.Thu.1 (1124967886)
Thu Aug 25 04:10:24 2005: Task 1: storage.Thu: Aborted by request. 0 volumes 
dumped, 61 failed

-----------------------------------



   I assumed (probably incorrectly) that if I just zapped
the last partial file on the tape that things would work out.
I was wrong. Same issue. So I figured I'd just wipe the last
dump and have it start from there... BUT.. apparently you're
not allowed to delete a dump unless you delete the whole
thing starting from the initial. So I figured... ok...
I'll just run scantape on everything.... BUT.....
Apparently asking scantape to scan a series of tapes
doesn't work. I complains that the dumpid doesn't match.
Well.. of course it doesn't. Later tapes had later dumps.
Apparently it doesn't notice that the number it's looking
for is in the parentID slot?

So, now that I've really messed up my dump database,
does anyone have any experience in this area for how to
recover?

The butc trace for the backup scantape operation
--> backup scantape -dbadd -port 1 -c p
for the first tape and a half or so appears
below. I apologize for the length of the post.
I deleted most of the uninteresting stuff while trying
to keep enough so that a kind reader could make sense of it.
Let me know if more information would be helpful.

Thanks in advance.
Mike Polek
Pictage, Inc.


------- butc screen output -----

ScanTape and add to the database
Prompt for tape
Thanks, now proceeding with tape scanning operation.
Proceeding with tape operation
Dump label
----------
permanent tape name = ST000000 (1123239901)
AFS tape name = storage.month1.1 (1123239901)
creationTime = Fri Aug  5 04:06:11 2005
expirationDate = Sun Sep 25 04:06:11 2005
cell = p
size = 104857600 Kbytes
dump path = /month1
dump id = 1123239901
useCount = 18
-- End of dump label --

-- volume --
volume name: storage.KTDocuments.backup
volume ID 537880079
dumpSetName: storage.month1
dumpID 1123239901
level 0
parentID 0
endTime 0
clonedate Fri Aug  5 04:04:27 2005

-- volume --
volume name: storage.tardumps.backup
volume ID 537816609
dumpSetName: storage.month1
dumpID 1123239901
level 0
parentID 0
endTime 0
clonedate Fri Aug  5 04:04:26 2005

[delete uninteresting stuff]

-- volume --
volume name: st.ds.pw.OOBShipping.backup
volume ID 537407178
dumpSetName: storage.month1
dumpID 1123239901
level 0
parentID 0
endTime 0
clonedate Fri Aug  5 04:03:52 2005

-- volume --
volume name: st.ds.pw.FinalPrints.backup
volume ID 537407175
dumpSetName: storage.month1
dumpID 1123239901
level 0
parentID 0
endTime 0
clonedate Fri Aug  5 04:03:51 2005

butm: Tape close failed. Error 22
Warning: Can't close tape
      butm: error during tape close: Invalid argument
Unloading Data Transfer Element into Storage Element 82...done
Prompt for tape storage.month1.2 (1123239901)
Thanks, now proceeding with tape scanning operation.
Proceeding with tape operation
-- volume --
volume name: storage.KTDocuments.backup
volume ID 537880079
dumpSetName: storage.week2
dumpID 1123412683
level 1
parentID 1123239901
endTime 0
clonedate Sun Aug  7 04:04:28 2005

Warning: volume st.ds.pw.FinalPrints.backup (537407175) ignored. Incomplete 
- no last fragment
Warning: volume storage.KTDocuments.backup (537880079) ignored. Incomplete - 
no first fragment
-- volume --
volume name: storage.tardumps.backup
volume ID 537816609
dumpSetName: storage.week2
dumpID 1123412683
level 1
parentID 1123239901
endTime 0
clonedate Sun Aug  7 04:04:28 2005

Warning: volume storage.tardumps.backup (537816609) ignored. Expected DumpId 
1123239901, got 1123412683
-- volume --
volume name: storage.dbdumps.backup
volume ID 537812111
dumpSetName: storage.week2
dumpID 1123412683
level 1
parentID 1123239901
endTime 0
clonedate Sun Aug  7 04:04:27 2005

Warning: volume storage.dbdumps.backup (537812111) ignored. Expected DumpId 
1123239901, got 1123412683
-- volume --
volume name: storage.sysreports.backup
volume ID 537638199
dumpSetName: storage.week2
dumpID 1123412683
level 1
parentID 1123239901
endTime 0
clonedate Sun Aug  7 04:04:25 2005