[OpenAFS-devel] The "50 second fetch-data"-bug?

sumit singh sumising@gmail.com
Thu, 13 Oct 2005 09:44:45 -0400



Is there a particular reason why 50 seconds was chosen as the default timeout
value for the rx connections in the fileserver?

Isn't this value too high nowadays? At the least, shouldn't it be configurable
at build time?

Any thoughts on this?


On 10/11/05, Niklas Edmundsson <Niklas.Edmundsson@hpc2n.umu.se> wrote:
>
> On Mon, 10 Oct 2005, Jim Rees wrote:
>
> > Does this seem like the same bug as the thread "50 second fetch-data"
> > a few days ago?
> >
> > I don't think so. The 100% cpu usage on the client indicates something
> > else, maybe an rx bug. A tcpdump around the time of your stall might be
> > useful.
>
> In /afs/hpc2n.umu.se/home/n/nikke/Public/tmp/afs-stall:
> afsprob.cap4 : Capture written by tcpdump -s 1500
> afsprob.cap4.txt : Start/end-timestamps of stall and other misc info.
>
> An interesting observation is that the chunksize indeed matters, I get
> identical behaviour with the CVS version if I use the same chunksize
> (8k) as 1.4.0RC does by default. With the new default (64k for 128MB
> memcache) the stalls are less frequent and not as long-lived, but they
> do still occur.
>
> This capture is from my AIX SMP machine, the Linux UP machine freezes
> up completely during the stalls so the capture is no good.
>
> If information is missing or doesn't make sense, just poke at me and
> I'll see what I can do :).
>
> /Nikke
> --
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | nikke@hpc2n.umu.se
>
> ---------------------------------------------------------------------------
> I didn't do it nobody saw me you can't prove anything
>
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> _______________________________________________
> OpenAFS-devel mailing list
> OpenAFS-devel@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-devel
>



--
thanks and regards,
sumit singh
