[OpenAFS-devel] The "50 second fetch-data"-bug?

Niklas Edmundsson Niklas.Edmundsson@hpc2n.umu.se
Thu, 20 Oct 2005 15:13:44 +0200 (MEST)


On Tue, 11 Oct 2005, Niklas Edmundsson wrote:

>> I don't think so.  The 100% cpu usage on the client indicates something
>> else, maybe an rx bug.  A tcpdump around the time of your stall might be
>> useful.
>
> In /afs/hpc2n.umu.se/home/n/nikke/Public/tmp/afs-stall:
> afsprob.cap4 : Capture written by tcpdump -s 1500
> afsprob.cap4.txt : Start/end-timestamps of stall and other misc info.
>
> An interesting observation is that the chunksize does matter: I get
> identical behaviour with the CVS version if I use the same chunksize (8k) as
> 1.4.0RC does by default. With the new default (64k for 128MB memcache) the
> stalls are less frequent and not as long-lived, but they still occur.
>
> This capture is from my AIX SMP machine. The Linux UP machine freezes up
> completely during the stalls, so a capture from it is no good.
>
> If information is missing or doesn't make sense, just poke at me and I'll see 
> what I can do :).

Did anyone take a look at this? If any information is missing, just
tell me; I can easily reproduce this. Since it shows up on more than
one arch, I guess that something fundamental is broken...

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se     |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
  Chickens are how eggs make more eggs.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=