[OpenAFS-devel] The "50 second fetch-data"-bug?

Niklas Edmundsson Niklas.Edmundsson@hpc2n.umu.se
Mon, 10 Oct 2005 14:15:01 +0200 (MEST)


Hi all!

I'm seeing the following:
* Client (1.4.0rc1@linux26amd64, 1.4.0rc6@AIX53, memcache) writes a
   large file to the server (1.4.0rc1@linux26amd64). The Linux box is the
   fileserver, so that client is running on the fileserver itself.
* Throughput is good, but then inexplicably stalls and recovers. The
   longer the transfer progresses, the more frequent the stalls become
   and the longer recovery takes.
* During the stall, the writing process on the client is showing 100%
   CPU usage.

If I use a fairly recent CVS version on the AIX box I don't get the
stalls. Whether this is due to saner defaults for chunk sizes etc. I
don't know.

Does this seem like the same bug as in the thread "50 second fetch-data"
from a few days ago?

I can probably provide all kinds of network traces, but since this
happens when transferring a large file (4GB is my current test victim)
a raw capture seems a bit impractical. Any tips?

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se     |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
  Cats took thousands of years to semi-domesticate humans.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=