[OpenAFS] Volume off-line after DB upgrade and VLDB lists wrong location
Staffan Hämälä
sh@ltu.se
Sun, 25 Nov 2012 08:10:14 +0100
After upgrading our DB servers to 1.6.1 (I'm not sure whether that is
related to the problem), and after OS updates and reboots of all file
servers yesterday, I noticed that one volume has a strange error that
I've not seen before.
The file servers all run dafs 1.6.1 (upgraded a few months ago).
One volume appears off-line, and vos examine cannot fetch any
information about it from the file server:
vos exa staff.xyz
Could not fetch the information about volume 537689471 from the server
: No such device
Volume does not exist on server afsfs1.its.ltu.se as indicated by the VLDB
Dump only information from VLDB
staff.xyz
RWrite: 537689471 Backup: 537689473
number of sites -> 1
server afsfs1.its.ltu.se partition /vicepc RW Site
I happened to have a record of the volume's earlier location, which
shows that it used to be on partition a, not c.
On the file server, I've checked:
$ ls -l /vicepc/*537689471*
-rw-r--r-- 1 root root 76 21 okt 19.19 /vicepc/V0537689471.vol
$ ls -l /vicepa/*537689471*
-rw-r--r-- 1 root root 76 21 okt 16.26 /vicepa/V0537689471.vol
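The same check can be run across every partition at once. What follows is only a sketch: the function name find_vol_headers and its root argument are made up for illustration, and it assumes the header naming seen in the listings above (V, then the volume ID zero-padded to ten digits, then .vol).

```shell
#!/bin/sh
# Hypothetical helper: list every volume header file for one volume ID.
# usage: find_vol_headers ROOT VOLID
#   ROOT  - directory that contains the vicep* partitions ("" means /)
#   VOLID - numeric volume ID, e.g. 537689471
find_vol_headers() {
    root=$1
    volid=$2
    # Assumed naming: V<10-digit zero-padded id>.vol
    name=$(printf 'V%010d.vol' "$volid")
    for part in "$root"/vicep*; do
        # Print the path only if a header file exists on this partition.
        if [ -f "$part/$name" ]; then
            printf '%s\n' "$part/$name"
        fi
    done
    return 0
}
```

Called as find_vol_headers "" 537689471 on the file server, it looks under /vicep* and should print the two paths from the ls output above. More than one line of output means the same volume ID has a header on more than one partition.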
vos listvol afsfs1 a gives:
staff.xyz 537689471 RW 151580897 K Off-line
vos listvol afsfs1 c gives this error:
**** Could not attach volume 537689471 ****
vos listvldb staff.xyz -s afsfs1 -p a lists partition c:
staff.xyz
RWrite: 537689471 Backup: 537689473
number of sites -> 1
server afsfs1.its.ltu.se partition /vicepc RW Site
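To compare this against what the file server reports, the partition can be pulled out of the VLDB output mechanically. A rough sketch, assuming the site line keeps the "server ... partition ... RW Site" layout shown above. vldb_rw_partition is a hypothetical helper name, not part of OpenAFS.

```shell
#!/bin/sh
# Hypothetical helper: read "vos listvldb" output on stdin and print the
# partition of the RW site. Relies on the assumed line layout
#   server <host> partition <partition> RW Site
vldb_rw_partition() {
    awk '$1 == "server" && $3 == "partition" && $5 == "RW" { print $4 }'
}
```

Used as vos listvldb staff.xyz | vldb_rw_partition, it prints /vicepc for the entry above, which can then be compared with the partition where vos listvol actually shows the volume.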
FileLog on the server has these errors:
Sat Nov 24 16:02:26 2012 Warning: Duplicate volume id 537689471 detected.
Sat Nov 24 16:34:06 2012 Volume 537689471 offline: not in service
(repeated lots of times)
What should we do about this? Would a vos remove of the copy on
partition c let the correct copy, on partition a, be found instead?
/Staffan