[OpenAFS-port-freebsd] Client deadlock?

Garrett Wollman wollman@csail.mit.edu
Wed, 30 Mar 2011 17:15:44 -0400


Running Ben's package of 1.6.0pre4, I have found bonnie++ to be a
sure-fire way of deadlocking (?) the client.  The following client
processes are running:

    0  1121     1   0  44  0  5832  1768 afsslp Ds    ??    0:00.01 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1117     1   0  46  0  5832  1556 sbwait IL     0    5:39.20 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1118     1   0  76  0  5832  1556 afs_rx DL     0    0:00.00 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1119     1   0  44  0  5832  1556 afswai DL     0    0:00.11 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1122     1   0  44  0  5832  1556 afscon DL     0    0:00.73 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1123     1   0  76  0  5832  1556 afscon DL     0    0:00.04 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1124     1   0  48  0  5832  1556 afsslp DL     0    7:07.72 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1125     1   0  67  0  5832  1556 afsslp DL     0    7:00.29 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1126     1   0  46  0  5832  1556 afsslp DL     0    6:06.30 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1127     1   0  49  0  5832  1556 afsslp DL     0    7:09.72 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1128     1   0  45  0  5832  1556 afsslp DL     0    6:28.13 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1129     1   0  46  0  5832  1556 afsslp DL     0    6:12.35 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache
    0  1130     1   0  76  0  5832  1556 afswai DL     0    0:02.75 /usr/local/sbin/afsd -stat 2800 -daemons 6 -volumes 128 -dynroot -fakestat-all -afsdb -memcache

...but none of them ever seem to get scheduled.  (The bonnie++ process
is also stuck in afsslp.)  I don't have a debugging kernel on this
machine (it's actually only mine for a day to do some performance
testing) so I can't easily get a backtrace.  rxdebug on the server
reports no active connections.  The client *is* working enough to
respond to rxdebug, and reports:

Trying 128.30.2.181 (port 7001):
Free packets: 205, packet reclaims: 8, calls: 41, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
Connection from host 128.30.2.188, port 7000, Cuid 8ebfc8f6/3647eba0
  serial 62749,  natMTU 1444, security index 0, client conn
    call 0: # 31372, state dally, mode: receiving, flags: receive_done
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Done.
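
As a rough aid for reading the rxdebug report above, here is a minimal Python sketch that pulls out the per-call lines and flags any call that looks in progress. The helper names (`parse_calls`, `busy_calls`) and the choice to treat "dally" and "not initialized" as idle are assumptions made for illustration, not part of rxdebug or OpenAFS; the regular expression is fitted only to the line shapes shown in this message.

```python
import re

# rxdebug output copied verbatim from the report above.
RXDEBUG_OUTPUT = """\
Trying 128.30.2.181 (port 7001):
Free packets: 205, packet reclaims: 8, calls: 41, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
Connection from host 128.30.2.188, port 7000, Cuid 8ebfc8f6/3647eba0
  serial 62749,  natMTU 1444, security index 0, client conn
    call 0: # 31372, state dally, mode: receiving, flags: receive_done
    call 1: # 0, state not initialized
    call 2: # 0, state not initialized
    call 3: # 0, state not initialized
Done.
"""

# Matches lines such as "call 0: # 31372, state dally, ...".
CALL_RE = re.compile(r"^\s*call (\d+): # (\d+), state ([^,]+)")

# Assumption: these two states mean no RPC is in progress on the channel.
IDLE_STATES = {"not initialized", "dally"}


def parse_calls(text):
    """Return (channel, call_number, state) for each 'call N:' line."""
    calls = []
    for line in text.splitlines():
        m = CALL_RE.match(line)
        if m:
            calls.append((int(m.group(1)), int(m.group(2)), m.group(3).strip()))
    return calls


def busy_calls(calls):
    """Return only the calls whose state is not in IDLE_STATES."""
    return [c for c in calls if c[2] not in IDLE_STATES]


if __name__ == "__main__":
    calls = parse_calls(RXDEBUG_OUTPUT)
    print("calls seen:", calls)
    print("calls in progress:", busy_calls(calls))
```

Under that assumption the list of calls in progress comes back empty, which would be consistent with the server side reporting no active connections.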

Other, more ordinary workloads do not seem to hang the client in this
way.
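
To make the process listing earlier in this message easier to take in at a glance, the following Python sketch tallies the afsd processes by wait channel. It assumes the listing is in the `ps -l` column layout (UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND), so that the ninth field is the wait channel; that layout is inferred from the data, since the header line was not included. The rows are copied from the listing with the afsd arguments dropped for brevity, and `tally_wchan` is a hypothetical helper name.

```python
from collections import Counter

# Rows from the listing above; afsd arguments omitted for brevity.
PS_ROWS = """\
0  1121     1   0  44  0  5832  1768 afsslp Ds    ??    0:00.01 /usr/local/sbin/afsd
0  1117     1   0  46  0  5832  1556 sbwait IL     0    5:39.20 /usr/local/sbin/afsd
0  1118     1   0  76  0  5832  1556 afs_rx DL     0    0:00.00 /usr/local/sbin/afsd
0  1119     1   0  44  0  5832  1556 afswai DL     0    0:00.11 /usr/local/sbin/afsd
0  1122     1   0  44  0  5832  1556 afscon DL     0    0:00.73 /usr/local/sbin/afsd
0  1123     1   0  76  0  5832  1556 afscon DL     0    0:00.04 /usr/local/sbin/afsd
0  1124     1   0  48  0  5832  1556 afsslp DL     0    7:07.72 /usr/local/sbin/afsd
0  1125     1   0  67  0  5832  1556 afsslp DL     0    7:00.29 /usr/local/sbin/afsd
0  1126     1   0  46  0  5832  1556 afsslp DL     0    6:06.30 /usr/local/sbin/afsd
0  1127     1   0  49  0  5832  1556 afsslp DL     0    7:09.72 /usr/local/sbin/afsd
0  1128     1   0  45  0  5832  1556 afsslp DL     0    6:28.13 /usr/local/sbin/afsd
0  1129     1   0  46  0  5832  1556 afsslp DL     0    6:12.35 /usr/local/sbin/afsd
0  1130     1   0  76  0  5832  1556 afswai DL     0    0:02.75 /usr/local/sbin/afsd
"""

WCHAN_FIELD = 8  # zero-based index of MWCHAN in the assumed ps -l layout


def tally_wchan(text):
    """Count processes per wait channel in ps -l style rows."""
    counts = Counter()
    for line in text.splitlines():
        # Split into at most 13 fields so the command stays in one piece.
        fields = line.split(None, 12)
        if len(fields) < 13:
            continue  # skip blank or malformed rows
        counts[fields[WCHAN_FIELD]] += 1
    return counts


if __name__ == "__main__":
    for wchan, n in tally_wchan(PS_ROWS).most_common():
        print(f"{wchan:8s} {n}")
```

With the rows as given, seven of the thirteen afsd processes are sleeping on afsslp, the same wait channel the bonnie++ process is stuck in.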

This is a 16-thread, 8-core, 2-socket Dell server of recent vintage,
with 16 GB of memory, and it's currently running stock FreeBSD 8.2.

-GAWollman