[OpenAFS] can't get volumes online
Jeffrey Hutzelman
jhutz@cmu.edu
Wed, 16 Feb 2005 19:19:59 -0500
On Friday, February 11, 2005 07:45:12 PM +0100 Stephan Wiesand
<Stephan.Wiesand@desy.de> wrote:
> For some 25 volumes, the salvager complained about problems with the
> header structure and renamed them to "bogus.<numeric ID>" and left them
> offline:
>
> ...
> Salvaged bogus.536883946 (536883946): 449 files, 1000045 blocks
>
> We tried dumping and restoring those to different volumes: They're still
> offline. We tried running the salvager on the new volumes again, but
>
> STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager -part /vicepd -volumeid
> 536883946 -showlog)
> SALVAGING VOLUME 536883946.
> xxx.yyy.zzz (536883946) not updated (created 02/11/2005 18:35)
> Salvaged xxx.yyy.zzz (536883946): 449 files, 1000045 blocks
>
> and the volume's still offline.
>
> Any ideas? Or do we have to assume that these volumes were corrupted to
> the point where recovery is completely impossible?

It would help if you identified the platform and AFS version you're using.
Note that quoting "STARTING AFS SALVAGER 2.4" does not help -- that version
string has said 2.4 at least since AFS 3.1, and still says the same thing
on the OpenAFS CVS head today.

When you say the volume is offline, I assume you are basing this on the
output you see in 'vos listvol' or 'vos examine'. One of the ways this can
happen is if there is another copy of the same volume (by ID) on a
lower-numbered partition on the same server. Have you checked that this
volume does not appear on /vicepa, /vicepb, or /vicepc? Is the volume
offline even when you restore it to a different server?
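
As a rough sketch of the duplicate-copy check above, the following shell
function looks for a volume header file in each partition directory it is
given. It assumes the usual on-disk layout, in which each volume has a header
file named V<ID zero-padded to 10 digits>.vol at the top of its /vicepX
partition. The function name find_volume_headers is made up for illustration.

```shell
#!/bin/sh
# Sketch: report which partition directories hold an on-disk header
# for a given volume ID.
# Assumption: headers are named V<ID zero-padded to 10 digits>.vol and
# sit at the top level of each partition directory.

find_volume_headers() {
    vol_id=$1
    shift
    # Build the expected header file name, e.g. V0536883946.vol
    header=$(printf 'V%010d.vol' "$vol_id")
    for part in "$@"; do
        if [ -e "$part/$header" ]; then
            printf '%s\n' "$part/$header"
        fi
    done
}

# Example (run on the file server itself):
# find_volume_headers 536883946 /vicepa /vicepb /vicepc /vicepd
```

A hit on any partition other than the one where you expect the volume to live
would be worth investigating.
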

Just as an additional check, does that volume (by number) actually appear
in the VLDB? What output do you get from 'vos listvldb 536883946'?

If the offline-ness survives a dump and restore to a different server, then
it is likely based on some persistent state which is recorded in a volume
dump. If this is the case, you may be able to get some useful information
by looking at a volume dump of one of these volumes.

Grab a copy of my volume dump tools from
/afs/cs.cmu.edu/project/systems-jhutz/dumpscan.

Do a dump of one of the offline volumes, and then run

    afsdump_scan -PV <dump_file>

The output contains all of the volume-level information that is recorded in
the volume dump, none of which should be particularly sensitive. Send a
copy of that output (it's not very long), and perhaps someone can comment
on what's wrong.
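
The dump-and-scan steps above might be scripted roughly as follows. This is
only a sketch: it assumes 'vos' and a built copy of afsdump_scan are on your
PATH (or that you point VOS and AFSDUMP_SCAN at them), and that you hold
whatever credentials 'vos dump' needs at your site. The function name
dump_and_scan and the file names in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: dump one volume to a file, then print its volume-level records.
# VOS and AFSDUMP_SCAN default to the plain command names; override them
# if the tools are installed somewhere else.
VOS=${VOS:-vos}
AFSDUMP_SCAN=${AFSDUMP_SCAN:-afsdump_scan}

dump_and_scan() {
    vol_id=$1
    dump_file=$2
    # Full dump of the volume into dump_file; stop if the dump fails.
    "$VOS" dump -id "$vol_id" -file "$dump_file" || return 1
    # -PV as given in the message above: print the volume-level information.
    "$AFSDUMP_SCAN" -PV "$dump_file"
}

# Example:
# dump_and_scan 536883946 /tmp/536883946.dump > /tmp/536883946.scan
```

The captured output file is what you would send along.
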

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA