[OpenAFS] Linux kernel panic, OpenAFS client, gconf
Jan-Marc Pilawa
j.pilawa@tu-bs.de
Thu, 17 Jun 2004 15:41:14 +0200
Hello *,
I have read some threads about this problem in the archives, but so far I have
no clue how to solve the frequent client crashes on SMP systems. The problem is
almost always triggered by gconfd-2; only once was it triggered by another
application (mozilla).
I upgraded from openafs-1.2.10 to 1.2.11 (on SuSE 9.0, kernel 2.4.21-xxx) and
applied a patch from Chas Williams for osi_vnodeops.c, but the behaviour is
almost the same. The situation has improved insofar as in some cases afsd now
just seems to hang, and the applications produce very high load because they
can't access AFS. In most cases the systems still produce oopses like the
following one (this is the ksymoops output for the kernel panic):
TT3<1>Unable to handle kernel paging request at virtual address ffffffff
c6148b50
*pde = 00006063
Oops: 0002 2.4.21-226-smp4G #1 SMP Tue Jun 15 10:28:32 UTC 2004
CPU: 1
EIP: 0010:[<c6148b50>] Tainted: P
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010292
eax: 00000003 ebx: ea16dec8 ecx: 00000046 edx: c032d058
esi: faba5c34 edi: 00000001 ebp: fa4fe660 esp: ea16de58
ds: 0018 es: 0018 ss: 0018
Process gconfd-2 (pid: 15463, stackpage=ea16d000)
Stack: c616d490 e8298b20 ea16deec ea16decc ea16dee8 ea16def8 ea16dec0 c6129419
c616d490 e8298b20 ea16deec ea16decc fa6043f8 00000008 ea16dee8 00000001
00000001 00000000 00000000 00000000 00000000 e8298b20 00000000 00000006
Call Trace: [<c616d490>] (28) [<c6129419>] (04) [<c616d490>] (76)
[<c6120e89>] (92) [<c6175664>] (12) [<c6175664>] (08) [<c6158b71>] (48)
[<c0163742>] (32) [<c016512f>] (60) [<c0109637>] (60)
Code: c6 05 ff ff ff ff 2a 83 c4 1c c3 90 8d 74 26 00 b8 76 d9 16
>>EIP; c6148b50 <[libafs]osi_Panic+20/60> <=====
>>ebx; ea16dec8 <[ax25]ax25_table_size+3191a58/ae43bf0>
>>edx; c032d058 <log_wait+0/c>
>>esp; ea16de58 <[ax25]ax25_table_size+31919e8/ae43bf0>
Trace; c616d490 <[libafs].rodata.end+4fe5/cb95>
Trace; c6129419 <[libafs]afs_lookup+fb9/1250>
Trace; c616d490 <[libafs].rodata.end+4fe5/cb95>
Trace; c6120e89 <[libafs]afs_access+f9/390>
Trace; c6175664 <[libafs]afs_global_lock+0/1c>
Trace; c6175664 <[libafs]afs_global_lock+0/1c>
Trace; c6158b71 <[libafs]afs_linux_lookup+61/1c0>
Trace; c0163742 <lookup_hash+c2/120>
Trace; c016512f <sys_unlink+8f/130>
Trace; c0109637 <system_call+33/38>
Code; c6148b50 <[libafs]osi_Panic+20/60>
00000000 <_EIP>:
Code; c6148b50 <[libafs]osi_Panic+20/60> <=====
0: c6 05 ff ff ff ff 2a movb $0x2a,0xffffffff <=====
Code; c6148b57 <[libafs]osi_Panic+27/60>
7: 83 c4 1c add $0x1c,%esp
Code; c6148b5a <[libafs]osi_Panic+2a/60>
a: c3 ret
Code; c6148b5b <[libafs]osi_Panic+2b/60>
b: 90 nop
Code; c6148b5c <[libafs]osi_Panic+2c/60>
c: 8d 74 26 00 lea 0x0(%esi,1),%esi
Code; c6148b60 <[libafs]osi_Panic+30/60>
10: b8 76 d9 16 00 mov $0x16d976,%eax
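In case it helps anyone comparing several such oopses, the symbolized call
chain can be pulled out of the ksymoops output with a few lines of Python.
This is only a rough sketch: the function name parse_trace and the regular
expression are assumptions based on the "Trace;" lines shown above, not
anything shipped with ksymoops or OpenAFS.

```python
import re

# Matches ksymoops lines such as:
#   Trace; c6129419 <[libafs]afs_lookup+fb9/1250>
#   Trace; c0163742 <lookup_hash+c2/120>
# The "[module]" prefix is optional (absent for core kernel symbols).
TRACE_RE = re.compile(
    r"^Trace;\s+(?P<addr>[0-9a-f]+)\s+"
    r"<(?:\[(?P<module>[^\]]+)\])?(?P<symbol>[^+>]+)"
    r"\+(?P<offset>[0-9a-f]+)/(?P<size>[0-9a-f]+)>"
)


def parse_trace(text):
    """Return a list of (module, symbol, offset) tuples, one per Trace; line.

    Hypothetical helper: 'kernel' is used as a placeholder module name
    for symbols that carry no [module] prefix.
    """
    frames = []
    for line in text.splitlines():
        m = TRACE_RE.match(line.strip())
        if m:
            frames.append(
                (m.group("module") or "kernel",
                 m.group("symbol"),
                 int(m.group("offset"), 16))
            )
    return frames


# A few lines copied from the ksymoops output above.
SAMPLE = """\
Trace; c6129419 <[libafs]afs_lookup+fb9/1250>
Trace; c616d490 <[libafs].rodata.end+4fe5/cb95>
Trace; c6120e89 <[libafs]afs_access+f9/390>
Trace; c0163742 <lookup_hash+c2/120>
"""

if __name__ == "__main__":
    for module, symbol, offset in parse_trace(SAMPLE):
        print(f"{module:8s} {symbol}+0x{offset:x}")
```

Filtering the result for module == "libafs" gives the AFS part of the chain,
which makes it easier to see whether different crashes go through the same
path (here afs_linux_lookup, afs_access, afs_lookup, osi_Panic).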
The systems crash most often around 12am, when many users are logged in, but I
have seen them crash at other times, too. I can still log in remotely or at
the console as root, and /sbin/reboot -f still works. That's fine, at least
for me, but not for the dozen or so users ;-).
Sincerely
Jan Pilawa
--
+ Kontakt ----------------------------------------------------+
+ Systembetreuung Rechenzentrum TU Braunschweig +
+ Hans-Sommer-Str. 65, D-38092 Braunschweig +
+ Tel: +49 531 391-5548 E-Mail: j.pilawa@tu-bs.de ____________+