[OpenAFS-devel] kernel BUG at /scratch/openafs/src/libafs/MODLOAD-2.6.13-MP/rx_kcommon.c:131!

Martin MOKREJŠ mmokrejs@ribosome.natur.cuni.cz
Fri, 02 Sep 2005 10:04:32 +0200


Hi,
  so I take back what I said about the bos messages related to /usr/afs/etc/KeyFile.
I think the problem is that once bosserver has found _no_ key (not even a KeyFile),
it still thinks the key is wrong when the actual key appears later. Anyway, a reboot
fixed the "problem". Yes, I should have used "bos addkey".


  Anyway, I wanted to shut down the afsd cache and this happened (I think I have already
posted such a stack trace to the list, but maybe not?):

# vos create phylo /vicepa i386_linux26'
> vos create phylo /vicepa i386_linux26'.usr
vos: the name of the root volume i386_linux26
vos create phylo /vicepa i386_linux26.usr exceeds the size limit of 22
# grep afsd /etc/conf.d/local.start 
#/usr/vice/etc/afsd -chunk 20 -nosettime -stat 1000 -daemons 12 -dcache 1000 -volumes 10 -files 300000
# vim /etc/conf.d/local.start 
# grep afsd /etc/conf.d/local.start 
/usr/vice/etc/afsd -chunk 20 -nosettime -stat 1000 -daemons 12 -dcache 1000 -volumes 100 -files 300000
# 
# /usr/vice/etc/afsd -shutdown
afsd: Shutting down all afs processes and afs state
afsd: AFS still mounted; Not shutting down
# umount /afs
umount: /afs: device is busy
umount: /afs: device is busy
# cd /
# umount /afs
Segmentation fault
# 
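
The order attempted in the session above (leave /afs, unmount it, then run afsd -shutdown) can be sketched as a small script. This is only an illustration of the sequence and does not avoid the oops shown below. The names AFS_MOUNT, AFSD, DRY_RUN, run and afs_shutdown are hypothetical helpers, not OpenAFS settings; only "umount /afs" and "afsd -shutdown" come from the session itself.

```shell
#!/bin/sh
# Sketch of the client shutdown order: leave /afs, unmount it, then stop afsd.
# AFS_MOUNT, AFSD and DRY_RUN are illustrative names, not OpenAFS settings.
AFS_MOUNT="${AFS_MOUNT:-/afs}"
AFSD="${AFSD:-/usr/vice/etc/afsd}"

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

afs_shutdown() {
    # A shell whose working directory is under /afs keeps the mount busy.
    cd / || return 1
    # Unmount first; afsd -shutdown refuses to proceed while AFS is mounted.
    run umount "$AFS_MOUNT" || return 1
    run "$AFSD" -shutdown
}
```

Calling afs_shutdown with DRY_RUN=1 only prints the two commands, which makes it possible to check the order without touching the mount.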

libafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
Found system call table at 0xc0471780 (pattern scan)
Starting AFS cache scan...found 0 non-empty cache files (0%).
AFS isn't unmounted yet! Call aborted
inode freed while on LRU------------[ cut here ]------------
kernel BUG at /scratch/openafs/src/libafs/MODLOAD-2.6.13-MP/rx_kcommon.c:131!
invalid operand: 0000 [#1]
SMP 
Modules linked in: libafs ohci_hcd ehci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<fa027eca>]    Tainted: P      VLI
EFLAGS: 00010296   (2.6.13) 
EIP is at osi_Panic+0x28/0x36 [libafs]
eax: 0000001b   ebx: e7d3fd64   ecx: 00000001   edx: 00000286
esi: e7d3fea4   edi: e7dc26e0   ebp: e7057e98   esp: e7057e84
ds: 007b   es: 007b   ss: 0068
Process umount (pid: 8687, threadinfo=e7056000 task=f6787a20)
Stack: fa04cf74 fa034ff3 00000000 e7d3fd64 e7d3fd64 e7057ea4 fa035017 e7d3fd64 
       e7057eb8 c01745fe f6ff2d64 e7d3fd64 f6e2dae8 e7057ec8 c01754d4 e7d3fd64 
       e7d3fd64 e7057ed0 c0175506 e7057edc c017555e e7056000 e7057eec fa036ff9 
Call Trace:
 [<c01039d1>] show_stack+0x7a/0x90
 [<c0103b52>] show_registers+0x152/0x1ca
 [<c0103d60>] die+0xf4/0x183
 [<c0103e70>] do_trap+0x81/0xb8
 [<c010414a>] do_invalid_op+0xa3/0xad
 [<c010363b>] error_code+0x4f/0x54
 [<fa035017>] afs_clear_inode+0x24/0x3e [libafs]
 [<c01745fe>] clear_inode+0xc7/0xc9
 [<c01754d4>] generic_forget_inode+0x113/0x12f
 [<c0175506>] generic_drop_inode+0x16/0x18
 [<c017555e>] iput+0x56/0x69
 [<fa036ff9>] afs_dentry_iput+0x7c/0x97 [libafs]
 [<c01727b9>] dput+0x157/0x1dd
 [<c0162ba3>] generic_shutdown_super+0x39/0x140
 [<c016348d>] kill_anon_super+0xc/0x35
 [<c0162aab>] deactivate_super+0x58/0x71
 [<c0176a9d>] __mntput+0x28/0x33
 [<c01694a2>] path_release_on_umount+0x29/0x2c
 [<c0177001>] sys_umount+0x37/0x76
 [<c0177059>] sys_oldumount+0x19/0x1b
 [<c0102acb>] sysenter_past_esp+0x54/0x75
Code: ff 5d c3 55 89 e5 53 bb a4 c4 04 fa 83 ec 10 85 c0 0f 44 c3 8b 5d 08 89 4c 24 08 89 5c 24 0c 89 54 24 04 89 04 24 e8 3a 8d 0f c6 <0f> 0b 83 00 28 98 04 fa 83 c4 10 5b 5d c3 55 83 fa 01 89 e5 57 
 

I'm not sure whether these are related, but I found this in FileLog:
Fri Sep  2 09:39:44 2005 File Server started Fri Sep  2 09:39:44 2005
Fri Sep  2 09:39:44 2005 Set thread id 15 for 'FiveMinuteCheckLWP'
Fri Sep  2 09:39:44 2005 Set thread id 16 for 'HostCheckLWP'
Fri Sep  2 09:39:44 2005 Set thread id 17 for 'FsyncCheckLWP'
Fri Sep  2 09:48:12 2005 fssync: volume 536870916 restored; breaking all call backs
Fri Sep  2 09:48:15 2005 fssync: volume 536870913 restored; breaking all call backs

and this in VolserLog:
Fri Sep  2 09:39:47 2005 Starting AFS Volserver 2.0 (/usr/afs/bin/volserver)
Fri Sep  2 09:45:34 2005 1 Volser: CreateVolume: volume 536870915 (root.cell) created
Fri Sep  2 09:45:34 2005 1 Volser: Clone: Cloning volume 536870912 to new volume 536870913
Fri Sep  2 09:45:35 2005 1 Volser: Clone: Cloning volume 536870915 to new volume 536870916
Fri Sep  2 09:45:56 2005 1 Volser: CreateVolume: volume 536870918 (home) created
Fri Sep  2 09:45:56 2005 1 Volser: CreateVolume: volume 536870921 (home.mmokrejs) created
Fri Sep  2 09:46:20 2005 1 Volser: CreateVolume: volume 536870924 (home.flegr) created
Fri Sep  2 09:48:12 2005 1 Volser: Clone: Recloning volume 536870915 to volume 536870916
Fri Sep  2 09:48:15 2005 1 Volser: Clone: Recloning volume 536870912 to volume 536870913


The machine is a dual-Xeon host running a linux-2.6.13 SMP+HIGHMEM kernel without PREEMPT.