[OpenAFS] odd problem with RW site after a botched replica

Timothy Balcer timothy@telmate.com
Mon, 29 Oct 2012 11:41:09 -0700


Hello all,

I have a volume that had a replica, which has now been removed with vos
remsite. I had made a mistake with the server directive originally, and I
attempted to correct the error midstream... ultimately, the RO volume
seemed to release. However, last night the RW volume went offline, as well
as the RO volume.

Now, this is a reproducible volume, so I -could- delete it and start over,
no problem, but I would really like to know why, now that I have removed
the RO copy, the salvage operation will not bring the volume back online.
This is the third time I am salvaging the volume; the first two were done
because of some information I read about doing a double salvage in order to
get rid of bad RO volumes. That did happen, by the way, as you can see
below. The *36 header file (V0536870936.vol) was the file for the RO site.

Here is the Salvagelog of the second salvage attempt:

10/29/2012 01:06:21 STARTING AFS SALVAGER 2.4 (/usr/lib/openafs/salvager /vicepb 536870935)
10/29/2012 01:41:53 1 nVolumesInInodeFile 32
10/29/2012 01:42:05 SALVAGING VOLUME 536870935.
10/29/2012 01:42:05 user.snap (536870935) updated 10/27/2012 13:06
10/29/2012 01:43:44 totalInodes 25179364
10/29/2012 01:51:09 The volume header file /vicepb/V0536870936.vol is not associated with any actual data (deleted)
10/29/2012 01:51:10 SYNC_ask: negative response on circuit 'FSSYNC'
10/29/2012 01:51:10 FSYNC_askfs: FSSYNC request denied for reason=101
10/29/2012 01:51:10 AskOnline:  file server denied online request to volume 536870935 partition /vicepb; trying again...
10/29/2012 01:51:10 SYNC_ask: negative response on circuit 'FSSYNC'
10/29/2012 01:51:10 FSYNC_askfs: FSSYNC request denied for reason=101
10/29/2012 01:51:10 AskOnline:  file server denied online request to volume 536870935 partition /vicepb; trying again...
10/29/2012 01:51:10 SYNC_ask: negative response on circuit 'FSSYNC'
10/29/2012 01:51:10 FSYNC_askfs: FSSYNC request denied for reason=101
10/29/2012 01:51:10 AskOnline:  file server denied online request to volume 536870935 partition /vicepb; trying again...
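
For anyone comparing several salvage runs, here is a rough sketch of how the
log above could be tallied with a small script. It is not an OpenAFS tool:
the function name summarize_salvage_log and the regular expressions are my
own assumptions, built only from the line formats visible in this log, and
it does not try to interpret what reason=101 means. It expects each log
entry on a single line, so wrapped lines need to be rejoined first.

```python
import re
from collections import Counter

# Sample lines taken from the SalvageLog in this message (wrapped lines rejoined).
SAMPLE_LOG = """\
10/29/2012 01:42:05 SALVAGING VOLUME 536870935.
10/29/2012 01:51:09 The volume header file /vicepb/V0536870936.vol is not associated with any actual data (deleted)
10/29/2012 01:51:10 SYNC_ask: negative response on circuit 'FSSYNC'
10/29/2012 01:51:10 FSYNC_askfs: FSSYNC request denied for reason=101
10/29/2012 01:51:10 AskOnline:  file server denied online request to volume 536870935 partition /vicepb; trying again...
"""

# Patterns are assumptions derived from the log text shown above.
DENIED = re.compile(r"FSSYNC request denied for reason=(\d+)")
ONLINE = re.compile(r"denied online request to volume (\d+) partition (\S+);")
HEADER = re.compile(r"volume header file (\S+) is not associated")


def summarize_salvage_log(log_text):
    """Count FSSYNC denials and collect orphaned header files (hypothetical helper)."""
    reasons = Counter()        # reason code -> number of denials
    refused = Counter()        # (volume id, partition) -> refused online requests
    orphan_headers = []        # header files the salvager reported as deleted
    for line in log_text.splitlines():
        m = DENIED.search(line)
        if m:
            reasons[m.group(1)] += 1
        m = ONLINE.search(line)
        if m:
            refused[(m.group(1), m.group(2))] += 1
        m = HEADER.search(line)
        if m:
            orphan_headers.append(m.group(1))
    return {"reasons": reasons, "refused": refused, "orphan_headers": orphan_headers}


if __name__ == "__main__":
    summary = summarize_salvage_log(SAMPLE_LOG)
    for key, value in summary.items():
        print(key, dict(value) if isinstance(value, Counter) else value)
```

Running it over each SalvageLog in turn makes it easy to see whether the
denial reason or the affected volume changes between attempts.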

and here is the vos listvldb output:

user.snap
    RWrite: 536870935
    number of sites -> 1
       server afs-db.foo.com partition /vicepb RW Site  -- New release

The rest of the volumes have no "release" notation on them for the RW sites.
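
To check which entries carry a release notation without reading the whole
listing by eye, something like the following could be run over saved
vos listvldb output. This is only a sketch: the function name flagged_sites
and the regular expression are assumptions based on the single entry shown
above, and other lines that vos listvldb may print (headers, totals) are
simply treated as unindented text.

```python
import re

# The entry shown in this message, as vos listvldb printed it.
SAMPLE_VLDB = """\
user.snap
    RWrite: 536870935
    number of sites -> 1
       server afs-db.foo.com partition /vicepb RW Site  -- New release
"""

# Assumed site-line format, derived only from the sample above:
#   server <host> partition <partition> <RW|RO> Site [-- <flag text>]
SITE = re.compile(r"^\s+server (\S+) partition (\S+) (RW|RO) Site(?:\s+-- (.+))?$")


def flagged_sites(listing):
    """Return (volume, server, partition, type, flag) for sites with a flag."""
    volume = None
    found = []
    for raw in listing.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if not line[0].isspace():
            # Unindented lines are taken to be volume names (an assumption).
            volume = line
            continue
        m = SITE.match(line)
        if m and m.group(4):
            found.append((volume, m.group(1), m.group(2), m.group(3), m.group(4).strip()))
    return found


if __name__ == "__main__":
    for entry in flagged_sites(SAMPLE_VLDB):
        print(entry)
```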

Any pointers?

-- 
Timothy Balcer / IT Services
Telmate / San Francisco, CA
Direct / (415) 300-4313
Customer Service / (800) 205-5510
