[OpenAFS-devel] 1.4.2 panicking Redhat ES3

Derrick J Brashear shadow@dementia.org
Thu, 8 Feb 2007 10:12:47 -0500 (EST)


There is a fix for that going into 1.4.3rc2, as it happens.

On Thu, 8 Feb 2007, Joe Buehler wrote:

> I have a machine that is panicking every night at the same time -- an automated
> build system kicks off at 19:30 and starts hammering the AFS client on the machine.
>
> This just started happening about a week ago; the machine was fine before that.
> Two things have been done recently in the cell:
>
> - we now use a Kerberos V server for authentication purposes (using
>  backwards compatibility -- we are still using klog)
> - one of the DB servers (the lowest IP number) was moved to a new machine
>  (the new lowest IP number)
>
> Here is the syslog info.  The pj process checks files out of an RCS repository
> in AFS.
>
> Feb  7 19:31:14 rocky kernel: assertion failed: code != -EAGAIN, file: /home/project-releases/tmp/openafs-1.4.2/src/afs/LINUX/osi_vnodeops.c, line: 484
> Feb  7 19:31:14 rocky kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
> Feb  7 19:31:14 rocky kernel:  printing eip:
> Feb  7 19:31:14 rocky kernel: f8c05880
> Feb  7 19:31:14 rocky kernel: *pde = 26173001
> Feb  7 19:31:14 rocky kernel: *pte = 00000000
> Feb  7 19:31:14 rocky kernel: Oops: 0002
> Feb  7 19:31:14 rocky kernel: nfs nfsd lockd sunrpc libafs-2.4.21-4.ELsmp.mp parport_pc lp parport autofs e1000 microcode loop keybdev mousedev hid input ehci-hcd usb-uhci usbcore ext3 jbd
> Feb  7 19:31:14 rocky kernel: CPU:    1
> Feb  7 19:31:14 rocky kernel: EIP:    0060:[<f8c05880>]    Tainted: PF
> Feb  7 19:31:14 rocky kernel: EFLAGS: 00010282
> Feb  7 19:31:14 rocky kernel:
> Feb  7 19:31:14 rocky kernel: EIP is at osi_Panic [libafs-2.4.21-4.ELsmp.mp] 0x20 (2.4.21-4.ELsmp)
> Feb  7 19:31:14 rocky kernel: eax: 0000007a   ebx: e8787ac0   ecx: 00000000   edx: c0380e14
> Feb  7 19:31:14 rocky kernel: esi: f8c2646f   edi: e8787b3b   ebp: 00000079   esp: e8787a50
> Feb  7 19:31:14 rocky kernel: ds: 0068   es: 0068   ss: 0068
> Feb  7 19:31:14 rocky kernel: Process pj (pid: 19918, stackpage=e8787000)
> Feb  7 19:31:14 rocky kernel: Stack: e8787ac0 00000010 000001e4 00000000 00000006 f5d368e0 e8786000 f8c05b04
> Feb  7 19:31:14 rocky kernel:        e8787ac0 00000010 000001e4 00000000 00000000 00000000 00000000 00000000
> Feb  7 19:31:14 rocky kernel:        00000000 00000002 bfffbb3a 00000202 00000202 00000202 00000040 de6ac6f5
> Feb  7 19:31:14 rocky kernel: Call Trace:   [<f8c05b04>] osi_AssertFailK [libafs-2.4.21-4.ELsmp.mp] 0x1d4 (0xe8787a6c)
> Feb  7 19:31:14 rocky kernel: [<f8bc10e9>] afs_TraverseCells_nl [libafs-2.4.21-4.ELsmp.mp] 0x29 (0xe8787b3c)
> Feb  7 19:31:14 rocky kernel: [<f8bc11f0>] afs_choose_cell_by_num [libafs-2.4.21-4.ELsmp.mp] 0x0 (0xe8787b50)
> Feb  7 19:31:14 rocky kernel: [<f8bc113d>] afs_TraverseCells [libafs-2.4.21-4.ELsmp.mp] 0x3d (0xe8787b5c)
> Feb  7 19:31:14 rocky kernel: [<f8bc11f0>] afs_choose_cell_by_num [libafs-2.4.21-4.ELsmp.mp] 0x0 (0xe8787b60)
> Feb  7 19:31:14 rocky kernel: [<f8bc13f0>] afs_GetCellStale [libafs-2.4.21-4.ELsmp.mp] 0x30 (0xe8787b7c)
> Feb  7 19:31:14 rocky kernel: [<f8bc14c2>] afs_IsPrimaryCellNum [libafs-2.4.21-4.ELsmp.mp] 0x22 (0xe8787b9c)
> Feb  7 19:31:14 rocky kernel: [<f8bc10e9>] afs_TraverseCells_nl [libafs-2.4.21-4.ELsmp.mp] 0x29 (0xe8787bac)
> Feb  7 19:31:14 rocky kernel: [<f8bddb13>] afs_FindVCache [libafs-2.4.21-4.ELsmp.mp] 0x73 (0xe8787bbc)
> Feb  7 19:31:15 rocky kernel: [<f8bc113d>] afs_TraverseCells [libafs-2.4.21-4.ELsmp.mp] 0x3d (0xe8787bcc)
> Feb  8 09:35:57 rocky syslogd 1.4.1: restart.
> Feb  8 09:35:57 rocky syslog: syslogd startup succeeded
>
> The CellServDB file has:
>
> >hekimian.com  #Spirent Communications Rockville, MD division
> 10.32.90.51    #bullwinkle.hekimian.com
> 10.32.90.94    #crater.hekimian.com
> 10.32.90.99    #cetus.hekimian.com
>
> The ThisCell file has:
>
> hekimian.com
>
> The afsd options are:
>
> /usr/vice/etc/afsd -stat 10000 -dcache 2400 -daemons 5 -volumes 128 -nosettime -afsdb -dynroot -fakestat -afsdb -dynroot
>
> fs listcells shows the following for hekimian.com:
>
> Cell hekimian.com on hosts crater.hekimian.com bullwinkle.hekimian.com cetus.hekimian.com.
> -- 
> Joe Buehler
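
The report above points out twice that the DB server with the lowest IP number was moved. As far as I know, Ubik favors the lowest-addressed database server in sync-site elections, which is presumably why that detail is called out. For anyone wanting to check which server that is in a given CellServDB, here is a minimal Python sketch. It is not part of OpenAFS; the function names parse_cellservdb and lowest_ip_server are hypothetical, and it assumes only the simple ">cell #comment" / "address #hostname" layout quoted above.

```python
import ipaddress

# Sample text copied from the CellServDB entry quoted in the report.
CELLSERVDB = """\
>hekimian.com  #Spirent Communications Rockville, MD division
10.32.90.51    #bullwinkle.hekimian.com
10.32.90.94    #crater.hekimian.com
10.32.90.99    #cetus.hekimian.com
"""


def parse_cellservdb(text):
    """Return {cell_name: [(ip, hostname), ...]} from CellServDB-style text.

    Hypothetical helper: handles only plain ">cell" and "address #host" lines.
    """
    cells = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Everything after '#' is a comment (cell description or hostname).
        entry, _, comment = line.partition("#")
        entry = entry.strip()
        comment = comment.strip()
        if entry.startswith(">"):
            # A line starting with '>' opens a new cell.
            current = entry[1:].strip()
            cells[current] = []
        elif current is not None and entry:
            # Any other non-empty entry is a server address for that cell.
            cells[current].append((ipaddress.ip_address(entry), comment))
    return cells


def lowest_ip_server(servers):
    """Pick the (ip, hostname) pair with the numerically lowest address."""
    return min(servers, key=lambda s: s[0])


cells = parse_cellservdb(CELLSERVDB)
ip, host = lowest_ip_server(cells["hekimian.com"])
print(f"lowest-IP DB server: {ip} ({host})")
```

With the entry from the report this prints 10.32.90.51 (bullwinkle.hekimian.com). Comparing parsed addresses rather than strings avoids mis-ordering values such as 10.32.90.9 and 10.32.90.51.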