[OpenAFS] nightly failure since upgrading to 1.6.5

Tracy Di Marco White gendalia@gmail.com
Mon, 10 Feb 2014 00:27:59 -0600


Every night at midnight we run 'vos backupsys'. For three nights in a row,
one of the servers I've upgraded to 1.6.5 and dafs has logged the errors
below during that run, after which it mostly stops working as a fileserver.
Is this fixed in 1.6.6? Is anyone else seeing it? This is on NetBSD 6.1.3.

Thanks,
Tracy

Feb 8:
FileLog
Sat Feb  8 00:02:42 2014 fssync: breaking all call backs for volume 537054876
Sat Feb  8 00:02:42 2014 SYNC_getCom:  error receiving command
Sat Feb  8 00:02:42 2014 FSYNC_com:  read failed; dropping connection (cnt=1372738)
Sat Feb  8 00:02:42 2014 _VLockFd: conflicting lock held on fd 222, offset 537011871 by pid 21378 (locktype=1)
Sat Feb  8 00:02:42 2014 VAttachVolume: another program has vol 537011871 locked
Sat Feb  8 00:02:42 2014 fssync: breaking all call backs for volume 537011873
Sat Feb  8 00:02:42 2014 VPreattachVolumeByVp_r: volume 537011871 not in quiescent state (state 2 flags 0x18)
Sat Feb  8 00:05:57 2014 CB: ProbeUuid for host B1EB0B00 (173.30.18.151:11887) failed -1

VolserLog
Sat Feb  8 00:02:42 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC'
Sat Feb  8 00:02:42 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server

Feb 9:
FileLog
Sun Feb  9 00:00:03 2014 SYNC_getCom:  error receiving command
Sun Feb  9 00:00:03 2014 FSYNC_com:  read failed; dropping connection (cnt=493489)
Sun Feb  9 00:00:03 2014 _VLockFd: conflicting lock held on fd 225, offset 538046785 by pid 4129 (locktype=1)
Sun Feb  9 00:00:03 2014 VAttachVolume: another program has vol 538046785 locked
Sun Feb  9 00:00:03 2014 VPreattachVolumeByVp_r: volume 538046785 not in quiescent state (state 2 flags 0x18)

VolserLog
Sun Feb  9 00:00:03 2014 1 Volser: Clone: Recloning volume 538046785 to volume 538046787
Sun Feb  9 00:00:03 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC'
Sun Feb  9 00:00:03 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server

Feb 10:
FileLog
Mon Feb 10 00:00:21 2014 fssync: breaking all call backs for volume 538410173
Mon Feb 10 00:00:22 2014 SYNC_getCom:  error receiving command
Mon Feb 10 00:00:22 2014 FSYNC_com:  read failed; dropping connection (cnt=542873)
Mon Feb 10 00:00:22 2014 _VLockFd: conflicting lock held on fd 40, offset 538316382 by pid 7155 (locktype=1)
Mon Feb 10 00:00:22 2014 VAttachVolume: another program has vol 538316382 locked
Mon Feb 10 00:00:22 2014 fssync: breaking all call backs for volume 538316384
Mon Feb 10 00:00:22 2014 VPreattachVolumeByVp_r: volume 538316382 not in quiescent state (state 2 flags 0x18)

VolserLog
Mon Feb 10 00:00:21 2014 1 Volser: Clone: Recloning volume 538410171 to volume 538410173
Mon Feb 10 00:00:22 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC'
Mon Feb 10 00:00:22 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
Mon Feb 10 00:00:22 2014 1 Volser: Clone: Recloning volume 538316382 to volume 538316384
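
In the Feb 9 and Feb 10 logs above, the volume that FileLog reports as
locked is the same one VolserLog reports recloning. For anyone who wants to
check their own logs for the same pattern, here is a rough Python sketch. It
is not part of OpenAFS: the function names and regular expressions are my
own assumptions, written only against the log lines quoted in this message,
and they may not match other message formats.

```python
import re

# Timestamp prefix as it appears in the logs above,
# e.g. "Sat Feb  8 00:02:42 2014 " (assumed format, taken from this message).
TS = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4} ")

# Message patterns copied from the FileLog and VolserLog excerpts above.
LOCKED = re.compile(r"VAttachVolume: another program has vol (\d+) locked")
RECLONE = re.compile(r"Recloning volume (\d+) to volume (\d+)")


def unwrap(text):
    """Join wrapped continuation lines back onto their log entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TS.match(line) or not entries:
            entries.append(line)        # a new timestamped entry
        else:
            entries[-1] += " " + line   # continuation of the previous entry
    return entries


def locked_and_recloned(filelog, volserlog):
    """Return volume IDs that FileLog reports as locked and that
    VolserLog reports as the source of a reclone."""
    locked = {m.group(1) for e in unwrap(filelog) if (m := LOCKED.search(e))}
    recloned = {m.group(1) for e in unwrap(volserlog) if (m := RECLONE.search(e))}
    return sorted(locked & recloned)
```

Pass the text of FileLog and VolserLog for the same night as two strings.
An empty result only means these two messages did not name the same volume;
it does not rule out the problem.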
