[OpenAFS] Re: nightly failure since upgrading to 1.6.5

Tracy Di Marco White gendalia@gmail.com
Thu, 13 Feb 2014 00:13:41 -0600



On Mon, Feb 10, 2014 at 2:23 PM, Andrew Deason <adeason@sinenomine.net> wrote:
>
> On Mon, 10 Feb 2014 00:27:59 -0600
> Tracy Di Marco White <gendalia@gmail.com> wrote:
>
> > VolserLog
> > Sat Feb  8 00:02:42 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC'
> > Sat Feb  8 00:02:42 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
>
> This message says what one of the problems is, but isn't providing a lot
> of information. If it's convenient for you to apply a patch and rebuild,
> the following patch would give us a little more information in this
> situation (from gerrit 10829):
>
> <http://git.openafs.org/?p=openafs.git;a=patch;h=9604a45e94ed23a2941d0a7e11bfd892a0bd0bf7>

VolserLog (yesterday)
Wed Feb 12 01:04:48 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC' command 65543, 200 != 292
Wed Feb 12 01:04:48 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
Wed Feb 12 01:04:48 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC' command 65543, 200 != 292
Wed Feb 12 01:04:48 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
Wed Feb 12 01:04:49 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC' command 65543, 200 != 292
Wed Feb 12 01:04:49 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
Wed Feb 12 01:04:49 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC' command 65543, 200 != 292
Wed Feb 12 01:04:49 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server
Wed Feb 12 01:04:52 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC' command 65543, 200 != 292
(continued for a while)
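
For anyone sifting through a longer VolserLog, here is a minimal Python sketch that pulls the fields out of the patched SYNC_ask message. It assumes each wrapped entry has been joined back onto one line, and the regular expression is modelled only on the lines quoted above. The helper name parse_mismatch and the dictionary keys are made up for illustration and are not part of OpenAFS. The log does not say which of the two lengths is which, so they are kept as a neutral pair.

```python
import re

# Pattern modelled on the patched log line quoted above (an assumption:
# other OpenAFS versions may word the message differently).
MISMATCH = re.compile(
    r"SYNC_ask:\s+length field in response inconsistent on circuit "
    r"'(?P<circuit>\w+)' command (?P<command>\d+), "
    r"(?P<left>\d+) != (?P<right>\d+)"
)


def parse_mismatch(line):
    """Return the fields of a length-mismatch line, or None if it does not match."""
    m = MISMATCH.search(line)
    if m is None:
        return None
    command = int(m.group("command"))
    left, right = int(m.group("left")), int(m.group("right"))
    return {
        "circuit": m.group("circuit"),
        "command": command,
        "command_hex": hex(command),   # easier to compare with header constants
        "low16": command & 0xFFFF,     # low 16 bits of the command code
        "lengths": (left, right),      # order as printed; meaning not stated in the log
        "difference": abs(left - right),
    }


line = (
    "Wed Feb 12 01:04:48 2014 SYNC_ask:  length field in response inconsistent "
    "on circuit 'FSSYNC' command 65543, 200 != 292"
)
info = parse_mismatch(line)
print(info["command_hex"], info["low16"], info["difference"])
```

For the line above this gives a command of 0x10007 (65536 + 7) and a 92-byte gap between the two lengths. Which FSSYNC opcode that command value corresponds to is not stated in the log and would need to be checked against the FSSYNC headers in the OpenAFS source.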

FileLog (yesterday)
Wed Feb 12 01:04:48 2014 SYNC_getCom:  error receiving command
Wed Feb 12 01:04:48 2014 FSYNC_com:  read failed; dropping connection (cnt=89505)
Wed Feb 12 01:04:48 2014 SYNC_getCom:  error receiving command
Wed Feb 12 01:04:48 2014 FSYNC_com:  read failed; dropping connection (cnt=89537)
Wed Feb 12 01:04:49 2014 SYNC_getCom:  error receiving command
Wed Feb 12 01:04:49 2014 FSYNC_com:  read failed; dropping connection (cnt=90013)
Wed Feb 12 01:04:49 2014 SYNC_getCom:  error receiving command
Wed Feb 12 01:04:49 2014 FSYNC_com:  read failed; dropping connection (cnt=90459)
Wed Feb 12 01:04:52 2014 SYNC_getCom:  error receiving command
Wed Feb 12 01:04:52 2014 FSYNC_com:  read failed; dropping connection (cnt=94010)
(continued for a while)

VolserLog (today)
Thu Feb 13 00:04:26 2014 SYNC_ask:  length field in response inconsistent on circuit 'FSSYNC' command 65543, 200 != 292
Thu Feb 13 00:04:26 2014 SYNC_ask: protocol communications failure on circuit 'FSSYNC'; attempting reconnect to server

FileLog (today)
Thu Feb 13 00:04:26 2014 SYNC_getCom:  error receiving command
Thu Feb 13 00:04:26 2014 FSYNC_com:  read failed; dropping connection (cnt=923666)
Thu Feb 13 00:04:26 2014 _VLockFd: conflicting lock held on fd 29, offset 537170029 by pid 6070 (locktype=1)
Thu Feb 13 00:04:26 2014 VAttachVolume: another program has vol 537170029 locked
Thu Feb 13 00:04:29 2014 fssync: breaking all call backs for volume 537170031
Thu Feb 13 00:04:29 2014 VPreattachVolumeByVp_r: volume 537170029 not in quiescent state (state 2 flags 0x18)

-Tracy
