[OpenAFS] Weird AFS fileserver problem

Mr Budzynowski budzynowski@hotmail.com
Tue, 13 Feb 2007 13:59:02 +1100


On the off chance that this is helpful...

A while back I was getting similar error messages that seemed to affect certain volumes at random times.

I'm not sure if this is what fixed the problem, but I synchronised all the system clocks and haven't seen the errors since.

Aleks Budzynowski
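
For anyone wanting to rule out the clock drift mentioned above, here is a minimal sketch of a skew check. It assumes you have already collected one clock reading per machine (how the readings are gathered is left out); the host names, the `find_skewed_hosts` helper, and the 60-second tolerance are all hypothetical and not taken from AFS itself.

```python
from datetime import datetime, timedelta, timezone


def find_skewed_hosts(host_times, reference, max_skew=timedelta(seconds=60)):
    """Return {host: skew} for hosts whose clock differs from the
    reference time by more than max_skew (an illustrative tolerance)."""
    skewed = {}
    for host, reported in host_times.items():
        skew = reported - reference  # positive means the host runs ahead
        if abs(skew) > max_skew:
            skewed[host] = skew
    return skewed


# Hypothetical readings; real values would come from the machines themselves.
reference = datetime(2007, 2, 13, 2, 59, 2, tzinfo=timezone.utc)
host_times = {
    "fileserver-a.example.org": reference + timedelta(seconds=2),
    "fileserver-b.example.org": reference - timedelta(minutes=7),
}

skewed = find_skewed_hosts(host_times, reference)
for host, skew in skewed.items():
    print(f"{host}: off by {skew.total_seconds():.0f} s")
```

With these made-up readings only the second host is reported, since it is seven minutes behind the reference.
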
> I've got a problematic volume on one of my file servers. When I go to the
> read-write version of the volume, I can see the data without a problem.
> When I went to the read-only copy, however, I would see the following:
>
> sebby:/usr/afsws/bin% ls
> ./CONFIG: Error 112
>
> I did a vos remsite on the volumes, and when only the read-write was left,
> I could see all the data.
>
> I then did a vos listvol and saw that the .readonly volume was still there.
> I tried to zap it, and it gave an error. Now, when I try to do a "vos
> listvol" or "vos examine" command on any volumes on that server, I see this:
>
> sebby:~% vos listvol antenor.ctd.anl.gov a
> Could not fetch the list of partitions from the server
> Possible communication failure
> Possible communication failure
>
> sebby:~% vos exa sun4x_510.usr.DE142Nmi
> Could not fetch the information about volume 1818570311 from the server
> Possible communication failure
> Error in vos examine command.
> Possible communication failure
>
> Dump only information from VLDB
>
> sun4x_510.usr.DE142Nmi
> RWrite: 1818570311 Backup: 1818570313
> number of sites -> 1
> server antenor.ctd.anl.gov partition /vicepa RW Site
>
>
> I plan to shut down the file server late tonight and doing a full fsck/salvage
> on the partition (we're still running the inode server). I just wondered if
> anyone had seen this before, and if anyone had any suggestions/comments on
> what could be causing this.