[OpenAFS-devel] OpenAFS 1.2.3 client hangs on Linux - kernels 2.4.2 and 2.4.9

Touretsky, Gregory gregory.touretsky@intel.com
Mon, 4 Mar 2002 22:51:49 +0200


Hi,

  While configuring a Linux machine as an NIS server, we found a strange
problem: AFS hangs when several (4) instances of "pwck -r" run
simultaneously. pwck verifies the integrity of /etc/passwd, and it stats
every user's home directory. We have ~3000 Unix accounts with home
directories in AFS (each home directory is a volume).
I was able to reproduce the problem by running several instances (10+) of
the following short script:
#!/bin/tcsh
#Usage <command> <file with the long list of AFS mount points>
foreach i (`cat $1`)
/bin/ls -ld $i
end
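
For anyone who wants to start all the instances at once rather than by hand,
the same load can be sketched in Python. This is only an approximation under
stated assumptions, not the script I actually ran: the helper names
(read_paths, run_instances) and the timeout are made up for illustration, and
each instance is a single "ls -ld" process given the whole list instead of
one "ls" per path. Whether that variation still triggers the hang is untested.

```python
import subprocess


def read_paths(list_file):
    """Return the non-empty lines of list_file, one AFS path per line."""
    with open(list_file) as fh:
        return [line.strip() for line in fh if line.strip()]


def run_instances(paths, instances=10, timeout=60):
    """Start `instances` concurrent `ls -ld` processes over the same paths.

    Returns one entry per process: its exit code, or None if it had not
    finished within `timeout` seconds (which may indicate a hang).
    The timeout is applied to each process in turn, so the total wait can
    be up to instances * timeout seconds.
    """
    # Launch every process first so that they all run at the same time.
    procs = [
        subprocess.Popen(
            ["ls", "-ld", "--", *paths],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        for _ in range(instances)
    ]
    results = []
    for proc in procs:
        try:
            results.append(proc.wait(timeout=timeout))
        except subprocess.TimeoutExpired:
            # Still running; left alone, since a process stuck in the
            # kernel usually cannot be killed anyway.
            results.append(None)
    return results
```

With the mount point list in a file (the name mountpoints.txt here is
hypothetical), run_instances(read_paths("mountpoints.txt")) returns a list of
exit codes; a None entry marks an instance that did not finish in time.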

The problem is reproducible on Linux 2.4.2 and 2.4.9 kernels with OpenAFS
1.2.3. I couldn't reproduce it on 2.4.2 with IBM AFS 3.6 patch 4.
Here are the last lines of the fstrace output:
time 206.474904, pid 1262: Access vp 0xe0aa0000 mode 0x40 len 0x800 
time 206.474904, pid 1262: Access vp 0xe0aa0000 mode 0x40 len 0x800 
time 206.474904, pid 1262: Access vp 0xe0aa05b8 mode 0x40 len 0x1000 
time 206.474904, pid 1262: Access vp 0xe0aa05b8 mode 0x40 len 0x1000 
time 206.474904, pid 1262: Access vp 0xe0ad33b0 mode 0x40 len 0x800 
time 206.474904, pid 1262: Access vp 0xe0ad3598 mode 0x40 len 0x800 
time 206.484904, pid 1256: Analyze RPC op -1 conn 0xd3f5e6c0 code 0x0 user 0x0
time 206.484904, pid 1256: Mount point is to vp 0xe0bb8938 fid (1:537094601.42.831)
time 206.504904, pid 1265: Access vp 0xe0aa0000 mode 0x40 len 0x800 
time 206.504904, pid 1265: Access vp 0xe0aa0000 mode 0x40 len 0

You can see that the last line is incomplete.

Any thoughts?

Gregory Touretsky
Israel Engineering Computing
Unix Server Platforms
gregory.touretsky@intel.com
(+) 972-4-865-6377, Fax: 04-865-5999
iNET: 465-6377, M/S: IDC-1B