[OpenAFS] Windows client network behaviour

Anders Magnusson ragge@ltu.se
Thu, 22 Sep 2011 11:08:10 +0200


On 09/21/2011 05:23 PM, Jeffrey Altman wrote:
> On 9/21/2011 7:09 AM, Anders Magnusson wrote:
>    
>> In the hunt for oddities regarding the new IFS Windows client I have
>> observed a problem causing bad performance, and hopefully someone has
>> some idea about what is going on.
>>
>> Environment:
>> Server:  OpenAFS 1.4.12.1, CentOS 5.3
>> Client: Windows 7, OpenAFS 1.7.1
>>
>> The test case is to write an ISO image (700MB) to afs from local disk.
>>      
> What size is the cache?  Is the ISO larger than the cache?
>    
Cache size 2Gbyte, so cache larger than file.

> What is the chunksize?
>    
Unchanged.
> What is the blocksize?
>    
Unchanged. Eh, how is this parameter changed? I can't find it in the
documentation.
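
For what it's worth, the only place I could think of to look is the
client's registry key, so I dumped what is set there. This is guesswork
on my part: HKLM\SOFTWARE\OpenAFS\Client is the usual key, and CacheSize
and ChunkSize should be real value names, but BlockSize is purely my
assumption since I cannot find it documented anywhere:

# Guesswork: print the cache-related values (if present) from the
# Windows client's registry key.  The value names are assumptions,
# in particular BlockSize, which I cannot find in the documentation.
import winreg

KEY = r'SOFTWARE\OpenAFS\Client'

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as k:
    for name in ('CacheSize', 'ChunkSize', 'BlockSize'):
        try:
            value, _ = winreg.QueryValueEx(k, name)
            print('%s = %s' % (name, value))
        except FileNotFoundError:
            print('%s not set (service default applies)' % name)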
>    
>> If the switch port is set to 100Mbit I will get ~3Mbyte/s, but if it is
>> set to 1Gbit then I get ~10Mbyte/s.
>> Both of these numbers are much lower than they should be, and in
>> particular I cannot understand why the 100Mbit configuration ends up
>> so much slower than the 1Gbit one.
>>      
> More than likely it is because the RPC round trip time is longer and
> therefore the latency is higher.
>    
rxstats says (after two writes of the iso, asking the client):

C:\Users\Administrator>rxdebug -rxstats 130.240.42.9 7001
Trying 130.240.42.9 (port 7001):
Free packets: 1276/2562, packet reclaims: 0, calls: 3318, used FDs: 0
not waiting for packets.
0 calls waiting for a thread
6 threads are idle
0 calls have waited for a thread
rx stats: free packets 1276, allocs 1170493, alloc-failures(rcv 0/0,send 0/0,ack 0)
    greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0, selects 0, sendSelects 0
    packets read: data 52621 ack 536170 busy 0 abort 2 ackall 0 challenge 111 response 0 debug 5 params 0 unused 0 unused 0 unused 0 version 0
    other read counters: data 52621, ack 536139, dup 0 spurious 31 dally 0
    packets sent: data 1086992 ack 52651 busy 0 abort 0 ackall 0 challenge 0 response 111 debug 0 params 0 unused 0 unused 0 unused 0 version 0
    other send counters: ack 52651, data 1086973 (not resends), resends 19, pushed 0, acked&ignored 56101
         (these should be small) sendFailed 0, fatalErrors 0
    Average rtt is 0.001, with 1034074 samples
    Minimum rtt is 0.001, maximum is 0.125
    19 server connections, 0 client connections, 8 peer structs, 28 call structs, 28 free call structs
    0 clock updates
Done.

I think the rtt seems quite low...?
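
Here is the back-of-the-envelope arithmetic I use to turn that rtt into
a throughput ceiling. The window size is an assumption on my part (I
have not read it out of the client), so treat this as a sketch only:

# Sketch: upper bound for a windowed protocol, assuming the sender keeps
# `window` packets of `payload_bytes` in flight and then waits one round
# trip before it can send more.  window=32 is an assumption, not measured.

def throughput_bound(rtt_s, payload_bytes=1444, window=32):
    return window * payload_bytes / rtt_s

# With the average rtt from the rxdebug output above this gives ~46
# Mbyte/s, so the average round trip alone should not cap us at 3-10
# Mbyte/s.
print('%.1f Mbyte/s' % (throughput_bound(0.001) / 1e6))
# The 0.125s maximum rtt would hurt badly, but with over a million
# samples averaging 0.001 it can hardly be the common case.
print('%.1f Mbyte/s' % (throughput_bound(0.125) / 1e6))

So with the measured average rtt the round trip itself does not look
like the limiting factor.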
>> Before someone asks; there are no network limits here and both client
>> and server are on the same subnet.
>>
>> I have run tcpdump on both client and server and seen this traffic
>> "pattern":
>>
>> For 100Mbit:
>> - A data packet is sent out periodically at an almost exact rate of one
>>    1472-byte packet per 420 microseconds, which gives something close
>>    to 3Mbyte/s
>>
>> For 1Gbit:
>> - The same as for 100Mbit, except that the packet rate is one packet
>>    per 91 microseconds.
>>
>> The ack packet from the file server is sent back 12 microseconds after
>> each second data packet.
>>      
> How long does it take for each StoreData RPC to complete?
>    
Is there any good way to dig that out?
Anyway, here's a snippet from tcpdump output on the Windows machine;
130.240.42.9 is the client, 130.240.42.222 is the server.
The ack for each packet comes back within a few tens of microseconds.

09:40:52.626621 IP (tos 0x0, ttl 128, id 24472, offset 0, flags [DF], proto: UDP (17), length: 1472, bad cksum 0 (->3acd)!) 130.240.42.9.7001 > 130.240.42.222.7000:  rx data seq 8 ser 216 (1444)
09:40:52.626711 IP (tos 0x0, ttl 128, id 24473, offset 0, flags [DF], proto: UDP (17), length: 1472, bad cksum 0 (->3acc)!) 130.240.42.9.7001 > 130.240.42.222.7000:  rx data seq 9 ser 217 (1444)
09:40:52.626774 IP (tos 0x0, ttl  64, id 60100, offset 0, flags [none], proto: UDP (17), length: 93) 130.240.42.222.7000 > 130.240.42.9.7001:  rx ack seq 0 ser 112 first 8 serial 215 reason idle (65)
09:40:52.626803 IP (tos 0x0, ttl 128, id 24474, offset 0, flags [DF], proto: UDP (17), length: 1472, bad cksum 0 (->3acb)!) 130.240.42.9.7001 > 130.240.42.222.7000:  rx data seq 10 ser 218 (1444)
09:40:52.626897 IP (tos 0x0, ttl 128, id 24475, offset 0, flags [DF], proto: UDP (17), length: 1472, bad cksum 0 (->3aca)!) 130.240.42.9.7001 > 130.240.42.222.7000:  rx data seq 11 ser 219 (1444)
09:40:52.626963 IP (tos 0x0, ttl  64, id 60101, offset 0, flags [none], proto: UDP (17), length: 93) 130.240.42.222.7000 > 130.240.42.9.7001:  rx ack seq 0 ser 113 first 10 serial 217 reason idle (65)
09:40:52.626990 IP (tos 0x0, ttl 128, id 24476, offset 0, flags [DF], proto: UDP (17), length: 1472, bad cksum 0 (->3ac9)!) 130.240.42.9.7001 > 130.240.42.222.7000:  rx data seq 12 ser 220 (1444)
09:40:52.627081 IP (tos 0x0, ttl 128, id 24477, offset 0, flags [DF], proto: UDP (17), length: 1472, bad cksum 0 (->3ac8)!) 130.240.42.9.7001 > 130.240.42.222.7000:  rx data seq 13 ser 221 (1444)
09:40:52.627145 IP (tos 0x0, ttl  64, id 60102, offset 0, flags [none], proto: UDP (17), length: 93) 130.240.42.222.7000 > 130.240.42.9.7001:  rx ack seq 0 ser 114 first 12 serial 219 reason idle (65)
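
To answer my own question about digging out per-RPC timing, the quick
and dirty thing I did was to let a small script walk the tcpdump text
above (one line per packet) and print the spacing between data packets
plus the delay from the most recent data packet to each ack. It is just
timestamp arithmetic on the capture text, nothing OpenAFS-specific:

# Quick sketch: parse the tcpdump text above and print the gap between
# rx data packets and the delay from the latest data packet to each ack.
import re
import sys
from datetime import datetime

pkt_re = re.compile(r'^(\d\d:\d\d:\d\d\.\d+) .* rx (data|ack) ')

def report(text):
    prev_data = None
    for line in text.splitlines():
        m = pkt_re.match(line)
        if not m:
            continue
        t = datetime.strptime(m.group(1), '%H:%M:%S.%f')
        if m.group(2) == 'data':
            if prev_data is not None:
                gap = (t - prev_data).total_seconds() * 1e6
                print('data->data gap: %4.0f us' % gap)
            prev_data = t
        elif prev_data is not None:
            delay = (t - prev_data).total_seconds() * 1e6
            print('ack turnaround: %4.0f us' % delay)

if __name__ == '__main__':
    report(sys.stdin.read())

On this snippet it comes out as data-to-data gaps of roughly 90
microseconds and ack turnarounds of roughly 60-65 microseconds. 1444
payload bytes every ~91 microseconds is about 15Mbyte/s on the wire for
this burst, which is above the ~10Mbyte/s I see end to end, so
presumably there are pauses between bursts that a snippet this short
does not show.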


>> I have uninstalled the QoS module on the Windows interface.
>>
>> Any hints anyone? I think this smells like traffic shaping because of
>> the quite exact transmit rate, but since the QoS module is uninstalled
>> and the behaviour is seen on the Windows network interface I have no
>> clue where it may be.
>>
>> A side note: going via an SMB-AFS gateway on the same network gives
>> significantly better performance.
>>      
> The SMB client behavior is very different.  The SMB redirector sends
> data to the SMB server in 64K chunks, which are then written to the
> file server semi-synchronously.  As a result there is much less
> pressure on the cache regardless of its size.  For the IFS client at
> present, all 700MB will go into the Windows page cache and it will
> swallow the entire AFS cache at once.  Things degrade at that point,
> waiting for each RPC to complete in order to make more room for new
> data.
>    
What I meant here is that the gateway is a CentOS 5 machine running
Samba and OpenAFS 1.4.10 that stores the data in AFS.  So, writing to
the same place in AFS via this machine is significantly faster than
using the IFS client.  This just shows that the performance problem
isn't there when using the Linux client.

> If your cache size is large enough and the file servers are responsive,
> then it is possible to obtain 40MB/sec write speeds on 1Gbit links.  I
> am aware of where the bottlenecks are but it is going to take time for
> me to address them.
>    
Well, this write speed would be sufficient, but we are not close to it 
right now :-)

-- Ragge

> I will refer people to a blog post I wrote back in March 2008
>
> http://blog.secure-endpoints.com/2008/03/i-want-my-openafs-windows-client-to-be.html
>
> Jeffrey Altman