[OpenAFS] afs memcache tuning... lockups in afs_cv_wait
Mike Polek
mike@pictage.com
Wed, 13 Feb 2008 18:17:38 -0800
Mike Garrison wrote:
>
> What parameters are you using for the client? What does
> rxdebug <client> -port 7001 -rxstats show?
>
> --
> Mike Garrison
# cmdebug localhost -cache
Chunk files: 40960
Stat caches: 50000
Data caches: 40960
Volume caches: 512
Chunk size: 16384
Cache size: 655360 kB
Set time: no
Cache type: memory
Basically:
afsd -memcache -blocks 655360 -chunksize 14 -stat 50000 -daemons 6 -volumes 512 -nosettime
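The afsd flags and the cmdebug output agree with each other, which a little arithmetic confirms. This is a minimal Python sketch, assuming that -blocks counts 1 kB blocks and that -chunksize is the base-2 logarithm of the chunk size in bytes. The variable names are made up for illustration.

```python
# Values taken from the afsd command line above.
blocks_kb = 655360      # -blocks: cache size in 1 kB blocks (assumed unit)
chunksize_log2 = 14     # -chunksize: log2 of the chunk size in bytes

chunk_bytes = 2 ** chunksize_log2
chunk_count = (blocks_kb * 1024) // chunk_bytes

print(chunk_bytes)   # 16384, matches "Chunk size: 16384"
print(chunk_count)   # 40960, matches "Data caches: 40960"
```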
Kernel boot line: vmlinuz vmalloc=848M (a little tight, I know...)
/proc/meminfo:
MemTotal: 2071764 kB
MemFree: 77012 kB
Buffers: 0 kB
Cached: 1258180 kB
Active: 701412 kB
Inactive: 599428 kB
HighTotal: 1916864 kB
HighFree: 7296 kB
LowTotal: 154900 kB
LowFree: 69716 kB
VmallocTotal: 851960 kB
VmallocUsed: 833472 kB
VmallocChunk: 16740 kB
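To put a number on "a little tight", the vmalloc figures can be compared directly. This is a rough Python sketch that uses only the values printed above. It assumes the memory cache is allocated from vmalloc space, which the vmalloc= setting suggests but the output does not prove.

```python
# Figures copied from /proc/meminfo and cmdebug above, all in kB.
vmalloc_total_kb = 851960
vmalloc_used_kb = 833472
vmalloc_chunk_kb = 16740   # VmallocChunk: largest contiguous free block
cache_kb = 655360          # "Cache size" reported by cmdebug

free_kb = vmalloc_total_kb - vmalloc_used_kb
used_pct = round(100 * vmalloc_used_kb / vmalloc_total_kb, 1)
cache_pct = round(100 * cache_kb / vmalloc_total_kb, 1)

print(free_kb)     # 18488 kB of vmalloc space left
print(used_pct)    # 97.8 percent of vmalloc space in use
print(cache_pct)   # 76.9 percent of vmalloc space, if the cache lives there
```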
RX stats when throttled and humming along nicely:
# rxdebug localhost -port 7001 -rxstats | head -15
Trying 192.168.11.19 (port 7001):
Free packets: 155, packet reclaims: 0, calls: 228, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
rx stats: free packets 155, allocs 1954462, alloc-failures(rcv 0/0,send
266493/0,ack 0)
greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 0,
selects 0, sendSelects 0
packets read: data 16843 ack 580519 busy 0 abort 3 ackall 0 challenge
191 response 0 debug 217 params 0 unused 0 unused 0 unused 0 version 0
other read counters: data 16843, ack 575197, dup 0 spurious 5317 dally 5
packets sent: data 486114 ack 20105 busy 0 abort 0 ackall 0 challenge 0
response 191 debug 0 params 0 unused 0 unused 0 unused 0 version 0
other send counters: ack 20105, data 3835588 (not resends), resends 292,
pushed 0, acked&ignored 1970250
(these should be small) sendFailed 0, fatalErrors 0
Average rtt is 0.001, with 191368 samples
Minimum rtt is 0.000, maximum is 39.537
22 server connections, 198 client connections, 22 peer structs, 239 call
structs, 145 free call structs
RX stats under heavy load:
Trying 192.168.11.19 (port 7001):
Free packets: 9, packet reclaims: 2, calls: 434, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
rx stats: free packets 9, allocs 3892227, alloc-failures(rcv 0/0,send
281445/0,ack 0)
greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 6,
selects 0, sendSelects 0
packets read: data 22528 ack 1169510 busy 0 abort 3 ackall 0 challenge
374 response 0 debug 414 params 0 unused 0 unused 0 unused 0 version 0
other read counters: data 22528, ack 1158659, dup 0 spurious 10842 dally 9
packets sent: data 970891 ack 26381 busy 0 abort 2 ackall 0 challenge 0
response 374 debug 0 params 0 unused 0 unused 0 unused 0 version 0
other send counters: ack 26381, data 7687708 (not resends), resends 788,
pushed 0, acked&ignored 4089184
(these should be small) sendFailed 0, fatalErrors 0
Average rtt is 0.001, with 393261 samples
Minimum rtt is 0.000, maximum is 39.537
17 server connections, 232 client connections, 32 peer structs, 239 call
structs, 32 free call structs
RX stats after a lockup occurs:
Trying 192.168.11.19 (port 7001):
Free packets: 290, packet reclaims: 3, calls: 434, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
rx stats: free packets 290, allocs 3919137, alloc-failures(rcv 0/0,send
302633/0,ack 0)
greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 11,
selects 0, sendSelects 0
packets read: data 22611 ack 1177559 busy 0 abort 199 ackall 0 challenge
374 response 0 debug 509 params 0 unused 0 unused 0 unused 0 version 0
other read counters: data 22611, ack 1166647, dup 0 spurious 10903 dally 9
packets sent: data 977805 ack 26791 busy 0 abort 2 ackall 0 challenge 0
response 374 debug 0 params 0 unused 0 unused 0 unused 0 version 0
other send counters: ack 26791, data 7740578 (not resends), resends 788,
pushed 0, acked&ignored 4106054
(these should be small) sendFailed 0, fatalErrors 0
Average rtt is 0.001, with 395696 samples
Minimum rtt is 0.000, maximum is 39.537
15 server connections, 232 client connections, 32 peer structs, 239 call
structs, 138 free call structs
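Since the counters are cumulative, the change between the heavy-load snapshot and the post-lockup snapshot is easier to read as a set of deltas. The Python sketch below is one way to compute them. The regexes, the counter selection and the function names are illustrative assumptions, and the parsing only assumes the rxdebug output format shown in this message.

```python
import re

# Regexes for a few counters in `rxdebug -rxstats` output (illustrative).
PATTERNS = {
    "free_packets": r"Free packets: (\d+)",
    "allocs": r"allocs (\d+)",
    "send_alloc_failures": r"send (\d+)/",
    "no_buffers": r"noBuffers (\d+)",
    "aborts_read": r"packets read: .*?abort (\d+)",
    "data_sent": r"packets sent: data (\d+)",
    "free_call_structs": r"(\d+) free call structs",
}

def parse_rxstats(text):
    """Extract the counters named in PATTERNS from one snapshot."""
    flat = " ".join(text.split())  # undo the mail line wrapping
    return {name: int(re.search(pat, flat).group(1))
            for name, pat in PATTERNS.items()}

def delta(before, after):
    """Per-counter difference between two parsed snapshots."""
    return {name: after[name] - before[name] for name in before}

# Excerpts of the two snapshots above (only the lines the regexes need).
HEAVY_LOAD = """
Free packets: 9, packet reclaims: 2, calls: 434, used FDs: 64
rx stats: free packets 9, allocs 3892227, alloc-failures(rcv 0/0,send
281445/0,ack 0)
greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 6,
packets read: data 22528 ack 1169510 busy 0 abort 3 ackall 0 challenge
packets sent: data 970891 ack 26381 busy 0 abort 2 ackall 0 challenge 0
17 server connections, 232 client connections, 32 peer structs, 239 call
structs, 32 free call structs
"""

AFTER_LOCKUP = """
Free packets: 290, packet reclaims: 3, calls: 434, used FDs: 64
rx stats: free packets 290, allocs 3919137, alloc-failures(rcv 0/0,send
302633/0,ack 0)
greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 11,
packets read: data 22611 ack 1177559 busy 0 abort 199 ackall 0 challenge
packets sent: data 977805 ack 26791 busy 0 abort 2 ackall 0 challenge 0
15 server connections, 232 client connections, 32 peer structs, 239 call
structs, 138 free call structs
"""

changes = delta(parse_rxstats(HEAVY_LOAD), parse_rxstats(AFTER_LOCKUP))
for name, change in changes.items():
    print(f"{name}: {change:+d}")
```

On these two snapshots the most visible jumps are aborts read (+196), send alloc-failures (+21188) and free call structs (+106).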
After that, the counters pretty much grind to a halt; nothing much moves.
A little while later it shows:
Trying 192.168.11.19 (port 7001):
Free packets: 290, packet reclaims: 3, calls: 442, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
rx stats: free packets 290, allocs 3919169, alloc-failures(rcv 0/0,send
302633/0,ack 0)
greedy 0, bogusReads 0 (last from host 0), noPackets 0, noBuffers 11,
selects 0, sendSelects 0
packets read: data 22623 ack 1177571 busy 0 abort 199 ackall 0 challenge
374 response 0 debug 572 params 0 unused 0 unused 0 unused 0 version 0
other read counters: data 22623, ack 1166659, dup 0 spurious 10903 dally 9
packets sent: data 977817 ack 26799 busy 0 abort 2 ackall 0 challenge 0
response 374 debug 0 params 0 unused 0 unused 0 unused 0 version 0
other send counters: ack 26799, data 7740602 (not resends), resends 788,
pushed 0, acked&ignored 4106058
(these should be small) sendFailed 0, fatalErrors 0
Average rtt is 0.001, with 395696 samples
Minimum rtt is 0.000, maximum is 39.537
14 server connections, 183 client connections, 32 peer structs, 239 call
structs, 237 free call structs
The free call structs count goes up and server connections drop a bit,
but there are no major shifts that I can see. The machine was rebooted
right before the tests; I didn't reboot between the throttled test and
the overload test.
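How little moves between the two post-lockup snapshots can be seen by subtracting a few of the counters. The values in this small Python sketch are copied by hand from the output above. The dictionary keys are made-up labels, not rxdebug field names.

```python
# Counters copied from the two snapshots taken after the lockup.
after_lockup = {
    "calls": 434,
    "allocs": 3919137,
    "data_read": 22611,
    "ack_read": 1177559,
    "debug_read": 509,
    "data_sent": 977805,
    "ack_sent": 26791,
    "rtt_samples": 395696,
}
a_while_later = {
    "calls": 442,
    "allocs": 3919169,
    "data_read": 22623,
    "ack_read": 1177571,
    "debug_read": 572,
    "data_sent": 977817,
    "ack_sent": 26799,
    "rtt_samples": 395696,
}

# Difference per counter between the later and the earlier snapshot.
stalled = {k: a_while_later[k] - after_lockup[k] for k in after_lockup}
for name, change in stalled.items():
    print(f"{name}: {change:+d}")
```

Only a dozen data packets moved in each direction and the RTT sample count did not change at all. The largest increase is in debug packets read (+63), which presumably are the rxdebug queries themselves.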
Anything look interesting to you?
Thanks!!
Mike