[OpenAFS-devel] [CSL #301175] openafs-1.4.1 kernel panics

thomas@cs.wisc.edu
Wed, 24 May 2006 14:22:33 -0500


I am running into client kernel panics on both 2.4.21 (like RHEL 3) and
2.6.9 (like RHEL 4) kernels. The failures seem to be caused by osi_Alloc
failing to allocate memory and returning NULL. Details follow. Is
anyone with deep(er than me) knowledge of the cache manager aware of
such failures, or of a fix?

I've done some reading on the vmalloc= kernel boot parameter. vmalloc
data isn't reported in /proc/meminfo on the 2.4.21 system, but it is on
the 2.6.9 system. Is vmalloc=<size> a recommended way to deal with this?
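
For anyone wanting to watch vmalloc usage on a kernel that reports it, here
is a minimal sketch that pulls the Vmalloc* fields out of /proc/meminfo-style
text. The function names and the sample numbers are hypothetical, chosen only
for illustration; a kernel that does not export these fields (as on the 2.4.21
host above) simply yields an empty result.

```python
import re

# Hypothetical sample in /proc/meminfo format; the numbers are made up.
SAMPLE_MEMINFO = """\
MemTotal:      8299544 kB
VmallocTotal:   114680 kB
VmallocUsed:    110200 kB
VmallocChunk:     2048 kB
"""


def vmalloc_stats(meminfo_text):
    """Return the Vmalloc* fields (in kB) found in meminfo-style text."""
    stats = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(Vmalloc\w+):\s+(\d+)\s*kB", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def read_vmalloc_stats(path="/proc/meminfo"):
    """Read the live values; the result is empty if the kernel omits them."""
    with open(path) as handle:
        return vmalloc_stats(handle.read())


if __name__ == "__main__":
    stats = vmalloc_stats(SAMPLE_MEMINFO)
    if stats:
        unused_kb = stats["VmallocTotal"] - stats["VmallocUsed"]
        print("vmalloc unused: %d kB" % unused_kb)
    else:
        print("no Vmalloc* fields reported")
```

Calling read_vmalloc_stats() periodically (from cron, say) would show whether
VmallocUsed creeps toward VmallocTotal before a panic.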

Thanks much.

Dave Thompson
UW-Madison

The 2.4.21 kernel host is a mail server, dual Xeon with 8GB of memory.
The messages are:

afs_osi_Alloc: Can't vmalloc 4524 bytes.
crget: No more memory for creds!
Unable to handle kernel NULL pointer dereference at virtual
address 00000000
 printing eip:
f8ddf800
*pde = 2192b001
*pte = 69107067
Oops: 0002
sg ide-scsi openafs iptable_filter ip_tables e1000 microcode keybdev
mousedev hid input usb-uhci usbcore ext3 jbd 3w-9xxx sd_mod scsi_mod
CPU: 3
EIP: 0060:[<f8ddf800>] Tainted: P
EFLAGS: 00010282

EIP is at osi_Panic [openafs] 0x20 (2.4.21-40.ELsmp/i686)
eax: 00000021 ebx: f3d82000 ecx: 00000000 edx: c0387e98
esi: eefe9019 edi: 00000000 ebp: f8e19a40 esp: f3d83e9c
ds: 0068 es: 0068 ss: 0068
Process imapd (pid: 24128, stackpage=f3d83000)
Stack: f8e04c78 416ab962 00000296 f1a20c80 000066fc 00000001 00000001 f8dea664
       f8e04c78 416ab962 00000296 f1a20c80 f3d82000 f3d82000 f3d82000 f8dea71f
       00000001 00000001 eefe9019 f8df0bd7 c0456780 f0c3f280 f18db6ec f3d82000
[<f8dea664>] crget [openafs] 0xe4 (0xf3d83eb8)
[<f8e04c78>] .rodata.str1.4 [openafs] 0x437c (0xf3d83ebc)
[<f8dea71f>] crref [openafs] 0xf (0xf3d83ed8)
[<f8df0bd7>] afs_linux_permission [openafs] 0x17 (0xf3d83ee8)
[<c017391f>] permission [kernel] 0x4f (0xf3d83f08)
[<f8def7a0>] afs_linux_lookup [openafs] 0x0 (0xf3d83f14)
[<c0173de6>] link_path_walk [kernel] 0x76 (0xf3d83f18)
[<c0174779>] path_lookup [kernel] 0x39 (0xf3d83f58)
[<c0174ac9>] __user_walk [kernel] 0x49 (0xf3d83f68)
[<c016f6ee>] sys_stat64 [kernel] 0x2e (0xf3d83f84)
[<c02af06f>] no_timing [kernel] 0x7 (0xf3d83fc0)

(See openafs-1.4.1/src/afs/LINUX/osi_cred.c).
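
As an aid to reading the "Oops: 0002" line, here is a small sketch that
decodes the low three bits of an i386 page-fault error code (bit 0:
protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs.
kernel mode). The function name is hypothetical. A code of 0002 decodes to a
kernel-mode write to an unmapped page, which fits a write to address 00000000
inside osi_Panic.

```python
def decode_oops_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code."""
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection fault
        "cause": "protection fault" if code & 1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 4 else "kernel",
    }


if __name__ == "__main__":
    print(decode_oops_code("0002"))
```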


The 2.6.9 case is on a desktop and is in a different part of the
cache manager:

May 23 04:19:41 HOSTNAME kernel: allocation failed: out of vmalloc space
- use vmalloc=<size> to increase size.
May 23 04:19:50 HOSTNAME last message repeated 9 times
May 23 04:19:51 HOSTNAME kernel: afs_osi_Alloc: Can't vmalloc
12084 bytes.
May 23 04:19:51 HOSTNAME kernel: rxi_Alloc error<1>Unable to handle
kernel NULL pointer dereference at virtual address 00000000
May 23 04:19:51 HOSTNAME kernel: printing eip:
May 23 04:19:51 HOSTNAME kernel: printing eip:
May 23 04:19:51 HOSTNAME kernel: f8c745c8
May 23 04:19:51 HOSTNAME kernel: *pde = 00000000
May 23 04:19:51 HOSTNAME kernel: *pde = 00000000
May 23 04:19:51 HOSTNAME kernel: Oops: 0002 [#1]
May 23 04:19:51 HOSTNAME kernel: Oops: 0002 [#1]
May 23 04:19:51 HOSTNAME kernel: PREEMPT SMP
May 23 04:19:51 HOSTNAME kernel: Modules linked in: vmnet(U) vmmon(U)
parport_pc lp parport md5 ipv6 libafs(U) iptable_filter ip_tables sr_mod
ide_scsi xfs pcspkr dm_mirror dm_mod button battery ac uhci_hcd ehci_hcd
nvidia(U) shpchp snd_azx snd_hda_codec snd_pcm_oss snd_mixer_oss
snd_pcm snd_timer snd soundcore snd_page_alloc e100 mii floppy ext3 jbd
ata_piix libata sd_mod scsi_mod
May 23 04:19:51 HOSTNAME kernel: CPU: 1
May 23 04:19:51 HOSTNAME kernel: EIP: 0060:[<f8c745c8>] Tainted: PF VLI
May 23 04:19:51 HOSTNAME kernel: EFLAGS: 00210246 (2.6.9-34.106.unsupportedsmp)
May 23 04:19:51 HOSTNAME kernel: EIP is at osi_Panic+0x17/0x23 [libafs]
May 23 04:19:51 HOSTNAME kernel: eax: 0000000f ebx: f8c94bc1 ecx:
d8606cbc edx: f8c946e4
May 23 04:19:51 HOSTNAME kernel: esi: 00002f34 edi: ef6acc2c ebp:
f46b18c0 esp: d8606cb8
May 23 04:19:51 HOSTNAME kernel: ds: 007b es: 007b ss: 0068
May 23 04:19:51 HOSTNAME kernel: Process procmail (pid: 7474,
threadinfo=d8606000 task=d80cc030)
May 23 04:19:51 HOSTNAME kernel: Stack: f8c946e4 f8c95a80 00002f34
00000000 00000000 f8c6ccf5 00000000 00000000
May 23 04:19:51 HOSTNAME kernel: ef6acc20 ef6acc2c f8c7b835 ef6acc20
febc3750 00000006 f40c7924 00000000
May 23 04:19:51 HOSTNAME kernel: f40c7900 f11adec0 d8606000 f46b18c0
f8c3e808 00000038 c8874580 f8ca8b80
May 23 04:19:51 HOSTNAME kernel: Call Trace:
May 23 04:19:51 HOSTNAME kernel: [<f8c6ccf5>] rxi_Alloc+0x3d/0x5e
[libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c7b835>]
rxkad_NewClientSecurityObject+0x46/0x100 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c3e808>]
afs_ConnBySA+0x2eb/0x412 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c3e4ff>]
afs_Conn+0x147/0x165 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c54a4f>]
afs_FetchStatus+0x2b/0x4dc [libafs]
May 23 04:19:51 HOSTNAME kernel: [<c01a8aaa>] avc_has_perm+0x3b/0x45
May 23 04:19:51 HOSTNAME kernel: [<f8c3daa0>]
afs_TraverseCells_nl+0x1f/0x2f [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c3db72>]
afs_choose_cell_by_num+0x0/0xd [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c3dafb>]
afs_TraverseCells+0x4b/0x97 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c3dcca>]
afs_GetCellStale+0x26/0x2b [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c3dd52>]
afs_IsPrimaryCellNum+0x16/0x1c [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c55465>]
afs_FindVCache+0x2eb/0x304 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c55b67>]
afs_GetAccessBits+0xab/0xc0 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<c016d7e8>] __d_lookup+0xfb/0x12d
May 23 04:19:51 HOSTNAME kernel: [<c01642c3>] do_lookup+0x70/0x8f
May 23 04:19:51 HOSTNAME kernel: [<c016ca5b>] dput+0x2f/0x1a2
May 23 04:19:51 HOSTNAME kernel: [<f8c55c0c>]
afs_AccessOK+0x90/0x12b [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c55f66>]
afs_access+0x2bf/0x2e9 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c82ddc>]
afs_linux_permission+0x0/0xe9 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c82e73>]
afs_linux_permission+0x97/0xe9 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<f8c82ddc>]
afs_linux_permission+0x0/0xe9 [libafs]
May 23 04:19:51 HOSTNAME kernel: [<c0163ee0>] permission+0x2b/0x4f
May 23 04:19:51 HOSTNAME kernel: [<c01656c8>] may_open+0x53/0x21a
May 23 04:19:51 HOSTNAME kernel: [<c0165b37>] open_namei+0x2a8/0x56f
May 23 04:19:51 HOSTNAME kernel: [<c0157faa>] filp_open+0x45/0x70
May 23 04:19:51 HOSTNAME kernel: [<c01c1c90>]
direct_strncpy_from_user+0x39/0x58
May 23 04:19:51 HOSTNAME kernel: [<c0158346>] sys_open+0x31/0x7d
May 23 04:19:51 HOSTNAME kernel: [<c02d86e7>] syscall_call+0x7/0xb
May 23 04:19:51 HOSTNAME kernel: Code: e8 42 ff ff ff 89 d8 5b 5e c3
0f b7 d0 31 c0 e9 d3 ff ff ff 53 85 c0 bb c1 4b c9 f8 ff 74 24 08 0f
44 c3 51 52 50
e8 2a dd 4a c7 <c6> 05 00 00 00 00 00 83 c4 10 5b c3 57 89 c7 83 c8
ff 56 89 d6
May 23 04:19:51 HOSTNAME kernel: <0>Fatal exception: panic in 5 seconds

(See openafs-1.4.1/src/rx/rx.c)
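
To gather the failed allocation sizes from logs like the two above, a minimal
sketch follows. The function name is hypothetical, and the pattern is only
assumed to cover the "afs_osi_Alloc: Can't vmalloc N bytes." wording seen in
these messages, including the case where the mailer wrapped the size onto the
next line. Note that 12084 is 0x2f34, the same value that appears in esi and
on the stack in the 2.6.9 trace.

```python
import re

# Two messages copied from the reports above (the second is line-wrapped).
LOG = """\
afs_osi_Alloc: Can't vmalloc 4524 bytes.
May 23 04:19:51 HOSTNAME kernel: afs_osi_Alloc: Can't vmalloc
12084 bytes.
"""

# \s+ also matches a newline, so wrapped messages are still recognized.
PATTERN = re.compile(r"afs_osi_Alloc: Can't vmalloc\s+(\d+)\s+bytes")


def failed_alloc_sizes(log_text):
    """Return the byte counts of every failed afs_osi_Alloc request."""
    return [int(size) for size in PATTERN.findall(log_text)]


if __name__ == "__main__":
    for size in failed_alloc_sizes(LOG):
        print("%d bytes (0x%x)" % (size, size))
```

Both requests are small, which suggests the vmalloc address range itself was
exhausted rather than any single request being unusually large.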