[OpenAFS] Suspect AFS bottlenecks on a web server

Jason Edgecombe jason@rampaginggeek.com
Tue, 17 Nov 2009 17:09:12 -0500


Hi Everyone,

Our webserver has been brought to a crawl many times over the last few
weeks. I suspect it's an AFS bottleneck somewhere. I appreciate any help
I can get.

The web server runs Solaris 9 with OpenAFS 1.4.1.

We have three Apache installs running. Two of them hit 74 and 75
concurrent threads, and the system load shot up to 76.

This is the afsd config according to the "ps" command:
    /usr/vice/etc/afsd -nosettime -blocks 817388 -stat 2800 -dcache 2400 -daemons 5
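
For reference, the cache size these options imply can be worked out with
a short Python sketch. It assumes -blocks is counted in 1 KB units, as
the afsd documentation describes; the helper name parse_afsd_args is
made up for illustration and is not part of OpenAFS.

```python
import shlex

def parse_afsd_args(cmdline):
    """Turn an afsd command line into a dict of option -> value (hypothetical helper)."""
    tokens = shlex.split(cmdline)[1:]  # drop the program path
    opts = {}
    i = 0
    while i < len(tokens):
        name = tokens[i].lstrip("-")
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
        if has_value:
            value = tokens[i + 1]
            opts[name] = int(value) if value.isdigit() else value
            i += 2
        else:
            opts[name] = True  # bare flag such as -nosettime
            i += 1
    return opts

cmdline = ("/usr/vice/etc/afsd -nosettime -blocks 817388 "
           "-stat 2800 -dcache 2400 -daemons 5")
opts = parse_afsd_args(cmdline)

# Assumption: -blocks is in 1 KB units, so divide by 1024 for MiB.
cache_mib = opts["blocks"] / 1024
print(f"disk cache: {cache_mib:.1f} MiB, stat entries: {opts['stat']}, "
      f"dcache entries: {opts['dcache']}, daemons: {opts['daemons']}")
```

Under that assumption the disk cache comes to roughly 798 MiB.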

A snippet from dmesg (there are lots of these):
Nov 17 16:29:26 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
Nov 17 16:35:56 coe-web last message repeated 150 times
Nov 17 16:36:06 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
Nov 17 16:42:41 coe-web last message repeated 155 times
Nov 17 16:42:46 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
Nov 17 16:49:16 coe-web last message repeated 152 times
Nov 17 16:49:26 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
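
To put a number on "lots", the warnings can be totalled with a small
Python sketch that folds syslog's "last message repeated N times" lines
into the count. The function name count_warnings is only illustrative,
and the sample is just the seven lines quoted above.

```python
import re

REPEAT_RE = re.compile(r"last message repeated (\d+) times?")

def count_warnings(lines, needle="afs_ufswr"):
    """Count messages containing needle, including syslog repeat summaries."""
    total = 0
    last_matched = False  # did the most recent real message contain needle?
    for line in lines:
        repeat = REPEAT_RE.search(line)
        if repeat:
            # A repeat line refers to the message logged just before it.
            if last_matched:
                total += int(repeat.group(1))
            continue
        last_matched = needle in line
        if last_matched:
            total += 1
    return total

sample = """\
Nov 17 16:29:26 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
Nov 17 16:35:56 coe-web last message repeated 150 times
Nov 17 16:36:06 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
Nov 17 16:42:41 coe-web last message repeated 155 times
Nov 17 16:42:46 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
Nov 17 16:49:16 coe-web last message repeated 152 times
Nov 17 16:49:26 coe-web afs: WARNING: afs_ufswr vcp=30004616878, exOrW=0
"""

total = count_warnings(sample.splitlines())
print(total)  # 461 for the sample above (4 logged + 150 + 155 + 152)
```

That is 461 warnings in about 20 minutes for this excerpt alone.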

% rxdebug myhost -port 7001 -nodally
Trying 196.168.179.13 (port 7001):
Free packets: 212, packet reclaims: 979, calls: 56240332, used FDs: 64
not waiting for packets.
5 calls waiting for a thread  #<---- this is what suggests a bottleneck to me.
0 threads are idle
Connection from host 196.168.181.174, port 7000, Cuid bdc13423/254695c4
  serial 30,  natMTU 1444, security index 0, client conn
    call 0: # 3, state active, mode: receiving, flags: reader_wait, has_output_packets
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 196.168.93.185, port 7000, Cuid bdc13423/254695c8
  serial 88,  natMTU 1444, security index 0, client conn
    call 0: # 1, state active, mode: receiving
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 196.168.93.184, port 7000, Cuid b3925733/2c003168
  serial 15,  natMTU 1444, security index 0, server conn
    call 0: # 1, state precall, mode: error, flags: waiting_for_process receive_done, has_input_packets
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 196.168.93.184, port 7000, Cuid b3925733/2c00316c
  serial 15,  natMTU 1444, security index 0, server conn
    call 0: # 1, state precall, mode: error, flags: waiting_for_process receive_done, has_input_packets
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 196.168.93.185, port 7000, Cuid ba5bbc4a/2bd7c668
  serial 588,  natMTU 1444, security index 0, server conn
    call 0: # 5, state active, mode: error
    call 1: # 4, state not initialized
    call 2: # 4, state not initialized
    call 3: # 2, state not initialized
Connection from host 196.168.93.185, port 7000, Cuid ba5bbc4a/2bd7c688
  serial 51,  natMTU 1444, security index 0, server conn
    call 0: # 5, state precall, mode: receiving, flags: waiting_for_process receive_done, has_input_packets
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 196.168.93.184, port 7000, Cuid b3925733/2bff7ec8
  serial 13,  natMTU 1444, security index 0, server conn
    call 0: # 268, state precall, mode: eof, flags: waiting_for_process receive_done, has_input_packets
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Connection from host 196.168.181.174, port 7000, Cuid b4a85658/2bde9ab0
  serial 4631,  natMTU 1444, security index 0, server conn
    call 0: # 4402, state precall, mode: error, flags: waiting_for_process receive_done, has_input_packets
    call 1: # 1, state not initialized
    call 2: # 1, state not initialized
    call 3: # 0, state not initialized
Done.
Skipped 4 dallying connections.
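
The numbers that matter in that output can be pulled out with a rough
Python sketch like the one below. It assumes the line formats match
exactly what rxdebug printed above; summarize_rxdebug and the dictionary
keys are names invented here, and the sample is a shortened excerpt of
the same output.

```python
import re

def summarize_rxdebug(text):
    """Extract thread-starvation indicators from rxdebug text output."""
    waiting = re.search(r"(\d+) calls waiting for a thread", text)
    idle = re.search(r"(\d+) threads are idle", text)
    summary = {
        # None means the line was not present in the output.
        "calls_waiting": int(waiting.group(1)) if waiting else None,
        "idle_threads": int(idle.group(1)) if idle else None,
        # Counting the flag over the whole text is unaffected by line wrapping.
        "waiting_for_process": text.count("waiting_for_process"),
    }
    summary["starved"] = bool(summary["calls_waiting"]) and summary["idle_threads"] == 0
    return summary

sample = """\
Free packets: 212, packet reclaims: 979, calls: 56240332, used FDs: 64
not waiting for packets.
5 calls waiting for a thread
0 threads are idle
Connection from host 196.168.93.184, port 7000, Cuid b3925733/2c003168
  serial 15,  natMTU 1444, security index 0, server conn
    call 0: # 1, state precall, mode: error, flags: waiting_for_process
receive_done, has_input_packets
Connection from host 196.168.93.185, port 7000, Cuid ba5bbc4a/2bd7c668
  serial 588,  natMTU 1444, security index 0, server conn
    call 0: # 5, state active, mode: error
"""

summary = summarize_rxdebug(sample)
print(summary)
```

Run against the full output above, the waiting_for_process count should
come to 5, matching the "5 calls waiting for a thread" line.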

Sincerely,
Jason