[OpenAFS-devel] read performance 1.5.74 Linux

Hartmut Reuter reuter@rzg.mpg.de
Mon, 17 May 2010 17:21:26 +0200


Finally it works.

The following changes were necessary:

in afsconfog.h (or elsewhere  in a header included in rx)
#define RX_TRIMDATABUFS 1

This frees many of the 11 data buffers allocated in each rxi_readPacket in
rx_kcommon.c. The other changes are the patch proposed by Jaffrey Altman to
correct counting of rx_nPackets and the drastic increasement of AFS_NRXPACKETS
described below.

The mail below I had tried to send Jeffrey yesterday evening, but the mail
server at secure endpoints rejected it as spam.

Jeffrey Altman schrieb:
> The good news is that the actual packet counts are now being reported
> and that additional packets are being allocated but extremely slowly. 
> There is also evidence that rx_extraPackets is being ignored since the
> initial value is 256 in rx_globals.h and after the client starts there
> are only 147.

That's because in any case in afs_call.c it's overwritten either by the value
entered as -rxpck or bei the constant AFS_NRXPACKETS which used to be 80.
I drastically increased it to 1024.

> 
> Could you revert the change in rx_globals.h of rx_maxReceiveWindow and
> rx_maxSendWindow and set them back to 64 from 128?
> 
> Could you also change rx_CheckPackets() so that it allocates
> rx_maxSendWindow instead of rx_initSendWindow?

for debugging I had a printf in AllocPacketBufs which showed me rx_nFreePackets
and the number of packets allocated at a time. It were always 11 packets.
Before I had increased AFS_NRXPACKETS so much I had set it to 256. With this run
rx_nFreePackets was mostly around 380 but sometimes came down to 27 causing then
packet reclaims (I had also a printf in TooLow().

However, the problem is not solved: even with the high number of rx_nFreePackets
we get to TooLow several times even reading moderate sized files. Presently
rxdebug shows 1239/1251 free/total packets. That means someone allocates more
than 1000 packets and frees them later or they get freed by the reclaim.

Hartmut

> 
> I think the real problem is that rx_CheckPackets() is never called from
> a place where it is safe to allocate more packets.   I need to look into
> that when I get home later.
> 
> Now to volleyball ....
> 
> Jeffrey Altman
> 
> 
> 
> On 5/16/2010 4:27 AM, Hartmut Reuter wrote:
>> Jeffrey Altman schrieb:
>>> Please send new rxdebug output.
>> Directly after the client started
>>
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug: ./rxdebug reuter 7001 -rxstat
>> Trying 130.183.2.117 (port 7001):
>> Free packets: 135/147, packet reclaims: 0, calls: 0, used FDs: 64
>> not waiting for packets.
>> 0 calls waiting for a thread
>> 1 threads are idle
>> 0 calls have waited for a thread
>> rx stats: free packets 135, allocs 1, alloc-failures(rcv 0/0,send 0/0,ack 0)
>>    greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0, selects
>> 0, sendSelects 0
>>    packets read: data 0 ack 0 busy 0 abort 0 ackall 0 challenge 0 response 0
>> debug 2 params 0 unused 0 unused 0 unused 0 version 0
>>    other read counters: data 0, ack 0, dup 0 spurious 0 dally 0
>>    packets sent: data 0 ack 0 busy 0 abort 0 ackall 0 challenge 0 response 0
>> debug 0 params 0 unused 0 unused 0 unused 0 version 0
>>    other send counters: ack 0, data 0 (not resends), resends 0, pushed 0,
>> acked&ignored 0
>>         (these should be small) sendFailed 0, fatalErrors 0
>>    0 server connections, 0 client connections, 0 peer structs, 0 call structs, 0
>> freecall structs
>> Done.
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug:
>>
>>
>> ~/a/n: read_test 12mb
>> open of 12mb took 0.011 sec.
>> read of 12000000 bytes took 88.792 sec.
>> close took 0.001 sec.
>> Total data rate = 132 Kbytes/sec. for read
>> ~/a/n:
>>
>>
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug: ./rxdebug reuter 7001 -rxstat
>> -peer -long -onlyport 7000 -onlyhost reuter2
>>
>> Trying 130.183.2.117 (port 7001):
>>
>> Free packets: 237/243, packet reclaims: 7171, calls: 8, used FDs: 64
>>
>> not waiting for packets.
>>
>> 0 calls waiting for a thread
>>
>> 1 threads are idle
>>
>> 0 calls have waited for a thread
>>
>> rx stats: free packets 237, allocs 31575, alloc-failures(rcv 0/0,send 0/0,ack 0)
>>
>>    greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 262,
>> selects 0, sendSelects 0
>>
>>    packets read: data 16562 ack 10 busy 0 abort 0 ackall 0 challenge 2 response
>> 0 debug 18 params 0 unused 0 unused 0 unused 0 version 0
>>
>>    other read counters: data 16562, ack 10, dup 0 spurious 0 dally 0
>>
>>    packets sent: data 87 ack 15187 busy 0 abort 0 ackall 0 challenge 0 response
>> 2 debug 0 params 0 unused 0 unused 0 unused 0 version 0
>>
>>    other send counters: ack 15187, data 87 (not resends), resends 0, pushed 0,
>> acked&ignored 2
>>
>>         (these should be small) sendFailed 0, fatalErrors 0
>>
>>    Average rtt is 0.000, with 2 samples
>>
>>    Minimum rtt is 0.000, maximum is 0.000
>>
>>    4 server connections, 4 client connections, 3 peer structs, 10 call structs,
>> 7 free call structs
>>
>> Showing only connections from host 130.183.2.114
>>
>> Showing only connections on port 7000
>>
>> Connection from host 130.183.2.114, port 7000, Cuid b3aecd24/2f905844
>>
>>   serial 2,  natMTU 1444, flags DESTROYED, security index 0, client conn
>>
>>     call 0: # 1, state dally, mode: receiving, flags: receive_done
>>
>>     call 1: # 0, state not initialized
>>     call 2: # 0, state not initialized
>>     call 3: # 0, state not initialized
>> Connection from host 130.183.2.114, port 7000, Cuid b3aecd24/8a90da68
>>   serial 14327,  natMTU 1444, flags pktCksum, security index 2, client conn
>>   rxkad: level clear, flags pktCksum
>>   Received 12010472 bytes in 8520 packets
>>   Sent 1552 bytes in 50 packets
>>     call 0: # 50, state dally, mode: receiving, flags: receive_done
>>     call 1: # 0, state not initialized
>>     call 2: # 0, state not initialized
>>     call 3: # 0, state not initialized
>> Done.
>> Peer at host 130.183.2.114, port 7000
>>         ifMTU 1444      natMTU 1444     maxMTU 1444
>>         packets sent 55 packet resends 0
>>         bytes sent high 0 low 877037
>>         bytes received high 0 low 0
>>         rtt 1 msec, rtt_dev 0 msec
>>         timeout 0.350 sec
>>         in/out packet skew: 0/0
>>         congestion window 2, MTU 1444
>>         current/if/max jumbogram size: 1/4/1
>> Done.
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug:
>>
>> This was without -rxpck 500.
>> Now the same with -rxpck 500 :
>>
>>
>>
>> Directly after the client started:
>>
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug: ./rxdebug reuter 7001
>> -rxstatTrying 130.183.2.117 (port 7001):
>> Free packets: 135/147, packet reclaims: 0, calls: 0, used FDs: 64
>> not waiting for packets.
>> 0 calls waiting for a thread
>> 1 threads are idle
>> 0 calls have waited for a thread
>> rx stats: free packets 135, allocs 1, alloc-failures(rcv 0/0,send 0/0,ack 0)
>>    greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0, selects
>> 0, sendSelects 0
>>    packets read: data 0 ack 0 busy 0 abort 0 ackall 0 challenge 0 response 0
>> debug 2 params 0 unused 0 unused 0 unused 0 version 0
>>    other read counters: data 0, ack 0, dup 0 spurious 0 dally 0
>>    packets sent: data 0 ack 0 busy 0 abort 0 ackall 0 challenge 0 response 0
>> debug 0 params 0 unused 0 unused 0 unused 0 version 0
>>    other send counters: ack 0, data 0 (not resends), resends 0, pushed 0,
>> acked&ignored 0
>>         (these should be small) sendFailed 0, fatalErrors 0
>>    0 server connections, 0 client connections, 0 peer structs, 0 call structs, 0
>> freecall structs
>> Done.
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug:
>>
>>
>>
>> ~/a/n: read_test 12mb
>> open of 12mb took 0.025 sec.
>> read of 12000000 bytes took 98.830 sec.
>> close took 0.001 sec.
>> Total data rate = 119 Kbytes/sec. for read
>> ~/a/n:
>>
>>
>>
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug: ./rxdebug reuter 7001 -rxstat
>> -peer -long -onlyport 7000 -onlyhost reuter2
>>
>> Trying 130.183.2.117 (port 7001):
>>
>> Free packets: 253/259, packet reclaims: 6827, calls: 8, used FDs: 64
>>
>> not waiting for packets.
>>
>> 0 calls waiting for a thread
>>
>> 1 threads are idle
>>
>> 0 calls have waited for a thread
>>
>> rx stats: free packets 253, allocs 30819, alloc-failures(rcv 0/0,send 0/0,ack 0)
>>
>>    greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 254,
>> selects 0, sendSelects 0
>>
>>    packets read: data 16210 ack 10 busy 0 abort 0 ackall 0 challenge 2 response
>> 0 debug 5 params 0 unused 0 unused 0 unused 0 version 0
>>
>>    other read counters: data 16210, ack 10, dup 0 spurious 0 dally 0
>>
>>    packets sent: data 87 ack 14775 busy 0 abort 0 ackall 0 challenge 0 response
>> 2 debug 0 params 0 unused 0 unused 0 unused 0 version 0
>>
>>    other send counters: ack 14775, data 87 (not resends), resends 0, pushed 0,
>> acked&ignored 2
>>
>>         (these should be small) sendFailed 0, fatalErrors 0
>>
>>    Average rtt is 0.000, with 2 samples
>>
>>    Minimum rtt is 0.000, maximum is 0.000
>>
>>    4 server connections, 3 client connections, 3 peer structs, 10 call structs,
>> 9 free call structs
>> Showing only connections from host 130.183.2.114
>> Showing only connections on port 7000
>> Connection from host 130.183.2.114, port 7000, Cuid 990a5587/b5c51428
>>   serial 13707,  natMTU 1444, flags pktCksum, security index 2, client conn
>>   rxkad: level clear, flags pktCksum
>>   Received 12010472 bytes in 8520 packets
>>   Sent 1552 bytes in 50 packets
>>     call 0: # 50, state dally, mode: receiving, flags: receive_done
>>     call 1: # 0, state not initialized
>>     call 2: # 0, state not initialized
>>     call 3: # 0, state not initialized
>> Done.
>> Peer at host 130.183.2.114, port 7000
>>         ifMTU 1444      natMTU 1444     maxMTU 1444
>>         packets sent 55 packet resends 0
>>         bytes sent high 0 low 830411
>>         bytes received high 0 low 0
>>         rtt 1 msec, rtt_dev 0 msec
>>         timeout 0.350 sec
>>         in/out packet skew: 0/0
>>         congestion window 2, MTU 1444
>>         current/if/max jumbogram size: 1/4/1
>> Done.
>> /afs/notebook/.cs/openafs-1.5/@sys/src/rxdebug:
>>
>>
>> Looks like -rxpck 500 would not have any effect.
>>
>> Thank you and good luck,
>> Hartmut
>> -----------------------------------------------------------------
>> Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
>> 			   	phone 		 +49-89-3299-1328
>> 			   	fax   		 +49-89-3299-1301
>> RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
>> Computing Center of the Max-Planck-Gesellschaft (MPG) and the
>> Institut fuer Plasmaphysik (IPP)
>> -----------------------------------------------------------------
> 


-- 
-----------------------------------------------------------------
Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
			   	phone 		 +49-89-3299-1328
			   	fax   		 +49-89-3299-1301
RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------