[OpenAFS] AFS hanging issue

Brennan Peter Sellner bsellner@andrew.cmu.edu
Tue, 8 Jan 2002 11:03:30 -0500 (EST)


Hello all,

I'm running OpenAFS 1.2.2 from the RedHat 7.2.1 rpms on RedHat 7.2 with
the RedHat 2.4.9-13 kernel as a client, using CMU's AFS servers.

Symptoms are as follows:
  - I can read and write just fine from the AFS tree, until I attempt to
build a somewhat large source tree residing on AFS (with the resulting
binaries being written to AFS).
  - The build hangs during the first call to the compiler.  Nothing is
written to /var/log/messages when this happens.
  - After this (even after I stop the compile), I can still write new files
to AFS and read all files, but if I attempt to write a modified file back
to AFS, emacs and vi either hang or declare a seg fault.  I've included a
copy of what's written to /var/log/messages as a result of this below.

AFS appears to "reset" itself to the beginning state (able to read/write
files, but becomes unhappy after the attempted build) after some lengthy
amount of time (between 2 and 24 hours; I left it overnight and it had
"reset" when I returned).

Rebooting did not resolve the problem.

An attempt to stop afs through the init.d script gave an error about /afs
being busy.

Attempted restarts of afs failed:
	[root@LENNON init.d]# ./afs restart
	Found libafs-2.4.9-13-i386.o from SymTable... Loading...
	Failed to load AFS client, not starting AFS services.

Output of lsmod (during afs unhappiness):
	[root@LENNON init.d]# /sbin/lsmod
	Module                  Size  Used by
	soundcore               4208   0  (autoclean)
	libafs-2.4.9-13-i386  395488   2
	autofs                 11296   0  (autoclean) (unused)
	3c59x                  25216   1
	ipchains               36224   0
	usb-uhci               20736   0  (unused)
	usbcore                49920   1  [usb-uhci]
	ext3                   59792   2
	jbd                    39040   2  [ext3]


Thanks much,

-Brennan Sellner

Portion of /var/log/messages generated when vi hung attempting to write a
small text file to disk.

Jan  7 20:07:09 LENNON kernel:  <1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
Jan  7 20:07:09 LENNON kernel:  printing eip:
Jan  7 20:07:09 LENNON kernel: c0208e5b
Jan  7 20:07:09 LENNON kernel: *pde = 00000000
Jan  7 20:07:09 LENNON kernel: Oops: 0002
Jan  7 20:07:09 LENNON kernel: CPU:    0
Jan  7 20:07:10 LENNON kernel: EIP:    0010:[__down_write+71/116]    Not tainted
Jan  7 20:07:10 LENNON kernel: EIP:    0010:[<c0208e5b>]    Not tainted
Jan  7 20:07:10 LENNON kernel: EFLAGS: 00010292
Jan  7 20:07:10 LENNON kernel: eax: c4a7bebc   ebx: c4a7a000   ecx: 00000000   edx: c69b65f4
Jan  7 20:07:10 LENNON kernel: esi: c54e5430   edi: c69b65ec   ebp: c69b6578   esp: c4a7bebc
Jan  7 20:07:10 LENNON kernel: ds: 0018   es: 0018   ss: 0018
Jan  7 20:07:10 LENNON kernel: Process vi (pid: 14091, stackpage=c4a7b000)
Jan  7 20:07:10 LENNON kernel: Stack: c69b65f4 00000000 c4a7a000 00000002 c69b65d8 c013049e 00000000 00000000
Jan  7 20:07:10 LENNON kernel:        00000000 c2ef0c10 00000002 c3e12090 00000242 c4a7bf7c c68b1f7b c3e12090
Jan  7 20:07:10 LENNON kernel:        00000080 c3e12090 00000002 ffffffeb c69b6578 00000000 00000242 c4a7bf7c
Jan  7 20:07:10 LENNON kernel: Call Trace: [do_truncate+62/128] do_truncate [kernel] 0x3e
Jan  7 20:07:10 LENNON kernel: Call Trace: [<c013049e>] do_truncate [kernel] 0x3e
Jan  7 20:07:10 LENNON kernel: [ipchains:__insmod_ipchains_S.bss_L44+393819/40271213] afs_linux_permission [libafs-2.4.9-13-i386] 0x4b
Jan  7 20:07:10 LENNON kernel: [<c68b1f7b>] afs_linux_permission [libafs-2.4.9-13-i386] 0x4b
Jan  7 20:07:10 LENNON kernel: [open_namei+1095/1432] open_namei [kernel] 0x447
Jan  7 20:07:10 LENNON kernel: [dput+28/348] dput [kernel] 0x1c
Jan  7 20:07:10 LENNON kernel: [<c01420ac>] dput [kernel] 0x1c
Jan  7 20:07:10 LENNON kernel: [ipchains:__insmod_ipchains_S.bss_L44+391092/40273940] afs_linux_open [libafs-2.4.9-13-i386] 0x40
Jan  7 20:07:10 LENNON kernel: [<c68b14d4>] afs_linux_open [libafs-2.4.9-13-i386] 0x40
Jan  7 20:07:10 LENNON kernel: [filp_open+50/80] filp_open [kernel] 0x32
Jan  7 20:07:10 LENNON kernel: [<c013131a>] filp_open [kernel] 0x32
Jan  7 20:07:10 LENNON kernel: [sys_open+51/164] sys_open [kernel] 0x33
Jan  7 20:07:10 LENNON kernel: [<c01315df>] sys_open [kernel] 0x33
Jan  7 20:07:10 LENNON kernel: [system_call+51/56] system_call [kernel] 0x33
Jan  7 20:07:10 LENNON kernel: [<c0106e03>] system_call [kernel] 0x33
Jan  7 20:07:10 LENNON kernel:
Jan  7 20:07:11 LENNON kernel:
Jan  7 20:07:11 LENNON kernel: Code: 89 01 8b 44 24 0c 85 c0 74 16 8d 76 00 e8 c3 92 f0 ff c7 03
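
The call trace above passes through sys_open, open_namei and do_truncate,
which suggests the failure happens when an existing file is reopened with
truncation. The following is only a minimal sketch of that access pattern,
assuming the editors save by truncating the existing file in place; the
function name rewrite_existing and the default target path are hypothetical
and were not part of the original test. Pointing it at a path under /afs on
an affected client may hang the process in the same way vi did.

```python
import os
import sys
import tempfile


def rewrite_existing(path, data=b"modified\n"):
    """Create PATH if needed, then reopen it with O_TRUNC and write DATA.

    Returns the final file contents so the result can be checked.
    """
    # Step 1: create the file (or open it if it already exists) and write
    # some initial contents. Writing new files was reported to keep working.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        os.write(fd, b"original\n")
    finally:
        os.close(fd)

    # Step 2: reopen the now-existing file with truncation. This is the
    # open()-with-truncate path that the do_truncate frame points at
    # (an assumption about how the editors write the file back).
    fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    # Hypothetical default target; pass a path under /afs as the first
    # argument to exercise the AFS client instead of the local disk.
    default = os.path.join(tempfile.gettempdir(), "afs_truncate_test.txt")
    target = sys.argv[1] if len(sys.argv) > 1 else default
    print(rewrite_existing(target))
```

On a healthy filesystem the script simply prints the rewritten contents;
comparing that against a run on the AFS tree would help separate the
"new file" case from the "modified file" case described in the symptoms.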