[OpenAFS] GCPAGs not working? Kernel panic in afs_pag_destroy

Bob Hoffman hoffman@cs.pitt.edu
Tue, 04 Mar 2008 15:54:10 -0500


I'm having much the same problem that Mike Polek reported recently.  We 
are suffering frequent kernel panics that appear to be the result of 
PAGs not being GC'd.

Version info:  Red Hat Enterprise Linux 4, x86_64 architecture, kernel 
2.6.9-42.0.8.ELsmp.
OpenAFS 1.4.6, installed from the openafs-1.4.6-el4.2.* RPMs.
The startup script runs "sysctl -w afs.GCPAGs=1".

The kernel panic is not written to any log file, so I took a photo of 
the screen and transcribed the last 24 lines here:

R10: 0000000100000000 R11: 0000ffff803fd180 R12: 00000000ffffffff
R13: 0000000000000286 R14: 0000000000000000 R15: ffffffff801cae8c
FS:  0000000000000000(0000) GS:ffffffff804e5900(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0:000000008005003b
CR2: 000000315f48fae0 CR3: 0000000037ea8000 CR4: 00000000000006e0
Process events/1 (pid: 11, threadinfo 00000108cfed4000, task 00000100cfea3030)
Stack: 0000000000000000 ffffffffa01fd724 00000104132dcf08 00000104132dcf00
       0000010037e61440 ffffffffa02355ea 00000104132dcf08 ffffffff801caf53
       0000000000000000 ffffffff803f15e0
Call Trace:<ffffffffa01fd724>{:libafs:afs_FindUser+212} <ffffffffa02355ea>{:libafs:afs_pag_destroy+26}
       <ffffffff801caf53>{key_cleanup+199} <ffffffff80147856>{worker_thread+419}
       <ffffffff80133dad>{default_wake_function+0} <ffffffff80133dfe>{__wake_up_common+67}
       <ffffffff80133dad>{default_wake_function+0} <ffffffff801476b3>{worker_thread+0}
       <ffffffff8014b4cb>{kthread+200} <ffffffff80110f47>{child_rip+8}
       <ffffffff8014b403>{kthread+0} <ffffffff80110f3f>{child_rip+0}


Code: 0f 0b 4b c0 24 a0 ff ff ff ff c6 00 0f b6 03 a8 01 74 12 48
RIP <ffffffffa01f41fb>{:libafs:Afs_Lock_ReleaseR+59} RSP <00000100cfed5e18>
<0>Kernel panic - not syncing: Oops


Like Mike, I am seeing /proc/sys/afs/GCPAGs end up set to 8 even though 
the startup script sets it to 1.  Should I set up a cron job that forces 
it back to 1 whenever it changes?  Is there anything I can instrument to 
help debug this?
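
For concreteness, here is a rough sketch of the kind of cron job I have in 
mind, written in Python.  The function name enforce_gcpags is made up for 
illustration, and the script assumes a Python recent enough to support the 
"with" statement.  It only reads /proc/sys/afs/GCPAGs and writes the wanted 
value back if it differs, which should be equivalent to rerunning the 
sysctl command.  I do not know whether forcing the value back would 
actually prevent the panics.

```python
#!/usr/bin/env python
# Sketch of a cron-driven check that puts afs.GCPAGs back to the value
# the startup script sets. Names here are illustrative, not part of OpenAFS.
import os
import sys

GCPAGS_PATH = "/proc/sys/afs/GCPAGs"  # sysctl entry observed on this machine
WANTED = 1                            # value set by "sysctl -w afs.GCPAGs=1"


def enforce_gcpags(path=GCPAGS_PATH, wanted=WANTED):
    """Write `wanted` to `path` if the current value differs.

    Returns the value that was found before any reset, so the caller
    can log what the setting had changed to.
    """
    with open(path) as f:
        found = int(f.read().strip())
    if found != wanted:
        with open(path, "w") as f:
            f.write("%d\n" % wanted)
    return found


if __name__ == "__main__":
    # If the AFS kernel module is not loaded the entry does not exist,
    # and there is nothing to check.
    if os.path.exists(GCPAGS_PATH):
        found = enforce_gcpags()
        if found != WANTED:
            sys.stderr.write("afs.GCPAGs was %d, reset to %d\n" % (found, WANTED))
```

Run from root's crontab every minute or so, this would also leave a 
record (via cron's mail) of when the value changed, which might help 
correlate the change with whatever is triggering it.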

Thanks in advance,

---Bob.