[OpenAFS] Re: Openafs 1.3.78 and kernel 2.4.29 oopses , same for 2.4.30 and openafs 1.3.82
ted creedon
tcreedon@easystreet.com
Mon, 9 May 2005 06:25:13 -0700
Looks like a compile problem if there's a symbol table error.
To eliminate that as a cause:
Make bzImage;make modules;make modules_install;make install;
Reboot into the new image
Run regen.sh then ./configure and built a new openafs system; install ane
test it.
I think there may be small differences in the m4 macros between various
operating systems.
This is the only way I can get reliable compiles. I have had one server
crash with 1.3.81 but I suspect the software raid filesystem.
tedc
-----Original Message-----
From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org]
On Behalf Of Dimitris Zilaskos
Sent: Monday, May 09, 2005 3:04 AM
To: Willy Tarreau
Cc: Marcelo Tosatti; openafs-info@openafs.org; linux-kernel@vger.kernel.org
Subject: [OpenAFS] Re: Openafs 1.3.78 and kernel 2.4.29 oopses , same for
2.4.30 and openafs 1.3.82
Hello ,
I have just completed 44 hours of uptime with 2.4.30 and openafs 1.3.78 .
Two oops occured at night , but the system did not freeze :
a)
ksymoops 2.4.11 on i686 2.4.30. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.30/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running right
now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get more
accurate output by telling me the kernel version and where to find map,
modules, ksyms etc. ksymoops -h explains the options.
Warning (compare_maps): libafs-2.4.30.mp symbol kallsyms_address_to_symbol
not found in /usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol kallsyms_symbol_to_address not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_chdir not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_exit not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_ioctl not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_open not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_wait4 not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_write not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry May 8 04:03:23 system
kernel: Unable to handle kernel NULL pointer dereference at virtual address
00000039 May 8 04:03:23 system kernel: c014d763 May 8 04:03:23 system
kernel: *pde = 00000000 May 8 04:03:23 system kernel: Oops: 0000
May 8 04:03:23 system kernel: CPU: 0
May 8 04:03:23 system kernel: EIP: 0010:[<c014d763>] Tainted: P
Using defaults from ksymoops -t elf32-i386 -a i386 May 8 04:03:23 system
kernel: EFLAGS: 00010213 May 8 04:03:23 system kernel: Unable to handle
kernel NULL pointer dereference at virtual address 00000039 May 8 04:03:23
system kernel: c014d2c0 May 8 04:03:23 system kernel: *pde = 00000000
May 8 04:03:23 system kernel: eax: 00000001 ebx: 00000001 ecx: f0ab6664
edx: ee91bb40
May 8 04:03:23 system kernel: esi: e5559009 edi: 00000001 ebp: c9403f40
esp: c9403f28
May 8 04:03:23 system kernel: ds: 0018 es: 0018 ss: 0018
May 8 04:03:23 system kernel: Process rsync (pid: 25555,
stackpage=c9403000) May 8 04:03:23 system kernel: Stack: d66a1c40 c9403f40
00000008 00000008 c9403f98 00000001 e5559000 00000009
May 8 04:03:23 system kernel: 0fbf25c9 bfff7950 c9403f98 e5559000
00000000 00000008 c014de69 e5559000
May 8 04:03:23 system kernel: e5559000 c9403f98 c014e1c9 bfff7950
c0152fa0 c9402000 c9403f98 bfff8960
May 8 04:03:23 system kernel: Call Trace: [<c014de69>] [<c014e1c9>]
[<c0152fa0>] [<c014a06f>] [<c0108ebb>]
May 8 04:03:23 system kernel: Code: 8b 7b 38 85 ff 0f 84 8e 00 00 00 f0 fe
0d e0 c9 41 c0 0f 88
>>EIP; c014d763 <link_path_walk+5c3/ac0> <=====
>>edx; ee91bb40 <_end+2e4b3700/3042bc20> esi; e5559009
>><_end+250f0bc9/3042bc20> ebp; c9403f40 <_end+8f9bb00/3042bc20> esp;
>>c9403f28 <_end+8f9bae8/3042bc20>
Trace; c014de69 <path_lookup+39/40>
Trace; c014e1c9 <__user_walk+49/60>
Trace; c0152fa0 <filldir64+0/130>
Trace; c014a06f <sys_lstat64+1f/90>
Trace; c0108ebb <system_call+33/38>
Code; c014d763 <link_path_walk+5c3/ac0> 00000000 <_EIP>:
Code; c014d763 <link_path_walk+5c3/ac0> <=====
0: 8b 7b 38 mov 0x38(%ebx),%edi <=====
Code; c014d766 <link_path_walk+5c6/ac0>
3: 85 ff test %edi,%edi
Code; c014d768 <link_path_walk+5c8/ac0>
5: 0f 84 8e 00 00 00 je 99 <_EIP+0x99>
Code; c014d76e <link_path_walk+5ce/ac0>
b: f0 fe 0d e0 c9 41 c0 lock decb 0xc041c9e0
Code; c014d775 <link_path_walk+5d5/ac0>
12: 0f 88 00 00 00 00 js 18 <_EIP+0x18>
9 warnings issued. Results may not be reliable.
b)
ksymoops 2.4.11 on i686 2.4.30. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.30/ (default)
-m /usr/src/linux/System.map (default)
Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running right
now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get more
accurate output by telling me the kernel version and where to find map,
modules, ksyms etc. ksymoops -h explains the options.
Warning (compare_maps): libafs-2.4.30.mp symbol kallsyms_address_to_symbol
not found in /usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol kallsyms_symbol_to_address not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_chdir not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_exit not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_ioctl not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_open not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_wait4 not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry Warning (compare_maps):
libafs-2.4.30.mp symbol sys_write not found in
/usr/local/lib/openafs/libafs-2.4.30.mp.o. Ignoring
/usr/local/lib/openafs/libafs-2.4.30.mp.o entry May 8 04:03:23 system
kernel: Oops: 0000
May 8 04:03:23 system kernel: CPU: 1
May 8 04:03:23 system kernel: EIP: 0010:[<c014d2c0>] Tainted: P
Using defaults from ksymoops -t elf32-i386 -a i386 May 8 04:03:23 system
kernel: EFLAGS: 00010213
May 8 04:03:23 system kernel: eax: 00000001 ebx: 00000001 ecx: f0ad16d0
edx: ee91bb40
May 8 04:03:23 system kernel: esi: d06da04b edi: 00000001 ebp: c9ccdf40
esp: c9ccdf28
May 8 04:03:23 system kernel: ds: 0018 es: 0018 ss: 0018
May 8 04:03:23 system kernel: Process lftp (pid: 24957, stackpage=c9ccd000)
May 8 04:03:23 system kernel: Stack: d091c0e0 c9ccdf40 00000004 00000009
c9ccdf98 00000001 d06da047 00000003
May 8 04:03:23 system kernel: 0012238c 08390ea8 c9ccdf98 d06da000
00000000 00000009 c014de69 d06da000
May 8 04:03:23 system kernel: d06da000 c9ccdf98 c014e1c9 08390ea8
fffffffb c9ccc000 c9ccdf98 bffffb50
May 8 04:03:23 system kernel: Call Trace: [<c014de69>] [<c014e1c9>]
[<c0149fdf>] [<c0108ebb>]
May 8 04:03:23 system kernel: Code: 8b 43 38 85 c0 0f 84 8e 00 00 00 f0 fe
0d e0 c9 41 c0 0f 88
>>EIP; c014d2c0 <link_path_walk+120/ac0> <=====
>>edx; ee91bb40 <_end+2e4b3700/3042bc20> esi; d06da04b
>><_end+10271c0b/3042bc20> ebp; c9ccdf40 <_end+9865b00/3042bc20> esp;
>>c9ccdf28 <_end+9865ae8/3042bc20>
Trace; c014de69 <path_lookup+39/40>
Trace; c014e1c9 <__user_walk+49/60>
Trace; c0149fdf <sys_stat64+1f/90>
Trace; c0108ebb <system_call+33/38>
Code; c014d2c0 <link_path_walk+120/ac0> 00000000 <_EIP>:
Code; c014d2c0 <link_path_walk+120/ac0> <=====
0: 8b 43 38 mov 0x38(%ebx),%eax <=====
Code; c014d2c3 <link_path_walk+123/ac0>
3: 85 c0 test %eax,%eax
Code; c014d2c5 <link_path_walk+125/ac0>
5: 0f 84 8e 00 00 00 je 99 <_EIP+0x99>
Code; c014d2cb <link_path_walk+12b/ac0>
b: f0 fe 0d e0 c9 41 c0 lock decb 0xc041c9e0
Code; c014d2d2 <link_path_walk+132/ac0>
12: 0f 88 00 00 00 00 js 18 <_EIP+0x18>
9 warnings issued. Results may not be reliable.
Again that happened at the time my afs server restarts . Just before the
oops :
May 8 04:03:20 system kernel: afs: Lost contact with file server
1.2.3.4 in cell cell.gr (all multi-homed ip
addresses down for the server)
May 8 04:03:20 system kernel: afs: Lost contact with file server
1.2.3.4 in cell cell.gr (all multi-homed ip
addresses down for the server)
May 8 04:03:21 system kernel: afs: failed to store file (110)
Looks like identical behaviour as in 2.4.29 with 1.3.78. From what I
have observed it seems that those oopses eventually will lead to all
processes accesing AFS entering D state ,till the system freezes.
Something in openafs 1.3.82 seems to accelarate this process. I am thinking
of running a cron job that checks for D state processes and kills eveyrthing
apart from the absolutetely essential processes if say more than 50 enter D
state, in an attemp to prevent total system freeze , and give 2.4.30 and
1.3.82 another try. Unless you guys have a better
suggestion:)
Regards ,
--
============================================================================
=
Dimitris Zilaskos
Department of Physics @ Aristotle University of Thessaloniki , Greece PGP
key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
MD5sum : de2bd8f73d545f0e4caf3096894ad83f pgp_public_key.asc
============================================================================
=
_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info