[OpenAFS-devel] read performance 1.5.74 Linux

Hartmut Reuter reuter@rzg.mpg.de
Fri, 14 May 2010 11:25:18 +0200


To be sure it's not one of my modifications, I built a client today
from the original 1.5.74 source.

The first thing that happens with a memory cache is an oops, because
afs_linux_storeproc doesn't handle memcache correctly. With that fixed (using
afs_GenericStoreProc instead for memcache), the performance looks as bad as with
the 1.5.74-osd client:

/afs/.notebook/u/hwr: write_test 100mb 0 100000000
write of 100000000 bytes took 1.434 sec.
close took 0.126 sec.
Total data rate = 62593 Kbytes/sec. for write
/afs/.notebook/u/hwr: fs flush 100mb
/afs/.notebook/u/hwr: read_test 100mb
open of 100mb bytes took 0.000 sec.
read of 100000000 bytes took 327.955 sec.
Total data rate = 298 Kbytes/sec. for read
/afs/.notebook/u/hwr:

Then I applied patch I1a11798f to rx.c and built 1.5.74-osd again.
There is a little improvement, but it's still very slow:

~/a/n: read_test 100mb
open of 100mb bytes took 0.067 sec.
read of 100000000 bytes took 29.154 sec.
Total data rate = 3342 Kbytes/sec. for read
~/a/n:

I think this problem has to be understood and solved before giving this
source to the users as 1.6!

Hartmut

Jeffrey Altman schrieb:
> On 5/13/2010 1:34 PM, Steve Simmons wrote:
>>> The first thing that comes to mind is that we found the rx library was
>>> dropping packets on the floor under high load.  This was fixed in the
>>> 1.5.74 release but has not been fixed in the 1.4.12 release.
>> Any feel for the net effect on throughput? Is the improvement significant enough to consider backporting to 1.4.13?
> 
> It's a serious bug.  It causes the Windows client stress test to have an
> average SMB request RTT of 51 seconds instead of under 5 seconds due to
> repeated RPC failures.  The problem is that Rx, when under load, drops
> packets on the floor by clearing the transmit queue before the packets
> have been sent and acknowledged.  Since the queue is empty, the sender
> thinks there is no more work to do and will never retry.  The receiver
> can only time out.  Regardless of which side of the connection the
> packets are dropped on, the RPC issuer must perform a retry of the request.
> 
> These fixes do need to be backported to 1.4.  That has not been done yet,
> but it will be before a 1.4.13 release.
> 
>>> The 1.5.74 client should be able to put more load on the file server
>>> as it does a much better job of efficiently reading from the page
>>> cache.
>> I don't understand the connection here. Hartmut seems to be saying that a 1.4.12osd client reading from a 1.4.12osd server runs 30x faster than a 1.5.74osd client reading from a 1.4.12osd server. Improvements in 1.5.74 client-side reading of the page cache shouldn't have any effect - presumably one is only reading the file because it isn't in page or disk cache. 30-fold is a pretty big number, too.
> 
> The connection is this.  If the client is able to issue more RPCs to the
> server in a shorter period of time, that puts stress on Rx and the
> likelihood of packets being discarded increases.  As soon as packets
> are discarded, the transfer rate hits the floor.
> 
> There are additional bottlenecks in Rx that have recently been removed
> as well.  1.5.74 no longer contains a bottleneck that prevented calls
> from ending while a request to allocate a new call was in flight.
> Allocations of new calls can block on outstanding packet processing, so
> this reduced the ability of Rx to process calls in parallel.  That in
> turn prevented multiple threads in the calling application from
> processing multiple calls in parallel effectively.
> 
> Another problem has been fixed on master: the rx_rpc_stats mutex
> is supposed to be a fine-grained lock.  Unfortunately, the way it was
> used, the mutex would halt all Rx processing whenever a ReapConnections
> operation was performed.
> 
> At some point, of course, backporting the changes gets silly.  We need to
> cut the 1.6 branch and begin to get off the 1.4 series.  It's been nearly
> five years since 1.4.0 was released.
> 
> Jeffrey Altman
> 


-- 
-----------------------------------------------------------------
Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
			   	phone 		 +49-89-3299-1328
			   	fax   		 +49-89-3299-1301
RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------