[OpenAFS] Problem with afs_syscall

Christian Pfaffel flash@itp.tu-graz.ac.at
26 Aug 2004 09:41:39 +0200


Hello everyone!

I have got the following problem on a couple of more or less heavily
loaded application servers all running OpenAFS version 1.2.11 on
debian x86 woody.

Once the AFS client has been running for some time, a machine tends to
block for some time completely. This happens whenever someone executes a program
using afs_syscall for converting a kerberos ticket to an AFS token.

To clearify with an example, here is what i can reproduce with aklog.

# kinit user
# time strace -c aklog
execve("/usr/bin/aklog", ["aklog"], [/* 45 vars */]) = 0
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.60    0.899662      449831         2           afs_syscall
  0.69    0.006296        1574         4           select
  0.18    0.001617          20        82        41 open
  0.16    0.001446           9       168           read
[..snip..]
  0.00    0.000007           4         2           time
  0.00    0.000007           7         1           getuid32
------ ----------- ----------- --------- --------- ----------------
100.00    0.912437                   631        50 total

real    0m0.941s
user    0m0.030s
sys     0m0.900s

Subsequent calls to aklog show a much quicker response, as expected.

# time strace -c aklog
execve("/usr/bin/aklog", ["aklog"], [/* 50 vars */]) = 0
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.61    0.001148          19        61        35 open
[..snip..]
  0.68    0.000024          24         1           afs_syscall
[..snip..]
  0.06    0.000002           2         1           getuid32
------ ----------- ----------- --------- --------- ----------------
100.00    0.003520                   396        47 total

real    0m0.022s
user    0m0.010s
sys     0m0.000s

When the client is freshly restarted (including kernel module reloading)
the output of kinit && aklog looks like

# time strace -c aklog
execve("/usr/bin/aklog", ["aklog"], [/* 50 vars */]) = 0
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 43.89    0.005531        1383         4           select
 14.69    0.001851          22        86        45 open
 11.40    0.001437           9       169           read
[..snip..]
  2.73    0.000344         172         2           afs_syscall
  2.33    0.000294           6        49           close
------ ----------- ----------- --------- --------- ----------------
100.00    0.012603                   640        59 total

real    0m0.043s
user    0m0.020s
sys     0m0.030s

Any comment would be well appreciated,

regards,
Christian

-- 
Christian Pfaffel <flash@itp.tu-graz.ac.at>
Technische Universität Graz                 Telefon: +43 / 316 / 873 - 81 90
Institut für Theoretische Physik            Telefax: +43 / 316 / 873 - 86 78
Petersgasse 16, A-8010 Graz   http://fubphpc.tu-graz.ac.at/~flash/pubkey.gpg