[Port-solaris] Kernel panic after a few minutes

Sebastian Hanigk shanigk@fs.tum.de
Mon, 24 Oct 2011 08:00:23 +0200


Good morning,

I have installed OpenAFS 1.6.0 on a test client running Solaris 10 with
the newest patches, but after a few minutes of file system usage the
kernel panics. This happens with the binary release from the web site as
well as with a version compiled from the sources (Solaris Studio 12.2
compiler). Using the NFS-enabled or NFS-disabled kernel modules makes no
difference. The machine is a Dell PowerEdge 2950 with the RAID controller
disabled and ZFS as the root file system. The AFS cache (2048M) is on a
UFS file system mounted under /var/vice/cache.

For startup, I'm using the OpenAFS SMF scripts from Mathias Feiler (Uni
Hohenheim,
<https://www.uni-hohenheim.de/~feiler/wiki/aktiv/doku.php?id=sys:var:sol:zoned_afs_server:afs-client_installation>).
The afsd startup command is
"/usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70 -afsdb -backup".
On a side note: as I understand it, this should start 3 daemons, but the
ps output lists 9 processes.

As I'm not very experienced with kernel debugging, perhaps some of you
can shed some light on the matter. The sister machine, which runs as an
AFS server without client functionality, works perfectly.

Here is the kernel panic console log:

panic[cpu3]/thread=ffffffff9f7e2a80: BAD TRAP: type=e (#pf Page fault) rp=fffffe

afsd: #pf Page fault
Bad kernel fault at addr=0x0
pid=921, pc=0xfffffffffb844f56, sp=0xfffffe8001707908, eflags=0x10217
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0 cr3: 4366b1000 cr8: c
        rdi:                0 rsi: ffffffffad492250 rdx: fffffffffbc054ff
        rcx:                0  r8:                0  r9: ffffffff9f021ef0
        rax:               ff rbx:                0 rbp: fffffe8001707940
        r10:               76 r11:         ffffffff r12: ffffffffad492250
        r13: fffffffffbcec218 r14: fffffffffbcec210 r15: ffffffffad4997c8
        fsb:                0 gsb: ffffffff9e980800  ds:               43
         es:               43  fs:                0  gs:              1c3
        trp:                e err:                2 rip: fffffffffb844f56
         cs:               28 rfl:            10217 rsp: fffffe8001707908
         ss:               30

fffffe8001707720 unix:die+da ()
fffffe8001707800 unix:trap+5e6 ()
fffffe8001707810 unix:cmntrap+140 ()
fffffe8001707940 unix:lock_try+6 ()
fffffe80017079b0 genunix:turnstile_block+19e ()
fffffe8001707a10 unix:mutex_vector_enter+249 ()
fffffe8001707a70 afs:rx_NewCall+3a ()
fffffe8001707b10 afs:RXAFS_GetTime+25 ()
fffffe8001707c90 afs:afs_CheckServers+1dda ()
fffffe8001707cd0 afs:afs_CheckServerDaemon+130 ()
fffffe8001707d80 afs:afs_syscall_call+29a ()
fffffe8001707e20 afs:Afs_syscall+c1 ()
fffffe8001707e60 genunix:syscall_ap+97 ()
fffffe8001707ec0 genunix:loadable_syscall+14d ()
fffffe8001707f10 unix:brand_sys_sysenter+1f2 ()


Best regards,

Sebastian