[OpenAFS-devel] System lockup with do_IRQ: stack overflow

Deon George deon@wurley.net
Mon, 26 Feb 2007 12:51:04 +1100


chas williams - CONTRACTOR wrote:
> i can't think of one.  the stack warning is given at a predetermined
> value.  you can't lower it to get a warning sooner.  however, you
> could start running your "heavy i/o" and get the results from sysrq-t.
> DEBUG_STACK_USAGE will show the min free stack on each thread.  this might
> point someone in the right direction.
>   
OK, on the first test run I got the trace below. No SysRq keys worked; 
the system was locked up.

do_IRQ: stack overflow: 492
 [<c04051ba>] dump_trace+0x69/0x1af
 [<c0405318>] show_trace_log_lvl+0x18/0x2c
 [<c04058cc>] show_trace+0xf/0x11
 [<c04059c9>] dump_stack+0x15/0x17
 [<c0406866>] do_IRQ+0x69/0xbc
 [<c04049fa>] common_interrupt+0x1a/0x20
DWARF2 unwinder stuck at common_interrupt+0x1a/0x20
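
For reading the number in the first line of the trace: as far as I 
understand the i386 do_IRQ check in 2.6.18-era kernels (built with 
CONFIG_DEBUG_STACKOVERFLOW), it masks the stack pointer down to its offset 
within the thread stack and warns once fewer than THREAD_SIZE/8 bytes remain 
above struct thread_info; the value printed is that remaining byte count. 
Below is a rough model of that arithmetic, not the kernel source. 
THREAD_SIZE = 4096 (4K stacks) and THREAD_INFO_SIZE = 52 are assumptions 
for illustration, and stack_check is a hypothetical name.

```python
# Rough model of the do_IRQ stack check; values marked "assumption" are
# illustrative and not taken from the running kernel.
THREAD_SIZE = 4096              # assumption: 4K kernel stacks
STACK_WARN = THREAD_SIZE // 8   # warning threshold: 512 bytes for 4K stacks
THREAD_INFO_SIZE = 52           # assumption: sizeof(struct thread_info)


def stack_check(esp):
    """Return the free byte count the warning would print, or None.

    esp is the stack pointer value at interrupt entry.
    """
    # Offset of esp from the bottom of the thread's stack area.
    offset = esp & (THREAD_SIZE - 1)
    # thread_info sits at the bottom, so the usable free space is what
    # lies between thread_info and the current stack pointer.
    if offset < THREAD_INFO_SIZE + STACK_WARN:
        return offset - THREAD_INFO_SIZE
    return None


if __name__ == "__main__":
    base = 0xC1234000  # hypothetical stack base, aligned to THREAD_SIZE
    print(stack_check(base + THREAD_INFO_SIZE + 492))
    print(stack_check(base + 2048))
```

Under these assumptions, "stack overflow: 492" would mean that only 492 
bytes of kernel stack were left when the interrupt arrived.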

The kernel is RHEL5 beta 2:
[deon@rhel5dev ~]$ uname -a
Linux rhel5dev.wurley.vpn 2.6.18-1.2747.el5 #1 SMP Thu Nov 9 18:55:30 EST 2006 i686 athlon i386 GNU/Linux

OpenAFS is from ATrpms:
[deon@rhel5dev ~]$ rpm -qa |grep -i openafs
openafs-client-1.4.2-20.el4.92.at
openafs-1.4.2-20.el4.92.at
openafs-kmdl-2.6.18-1.2747.el5-1.4.2-20.el4.92.at

I've run the same test on my FC6 box, except that I removed the ATrpms 
openafs kernel module and used my own build. It didn't lock up, so now I'm 
wondering if this is a problem with the ATrpms kernel module. I'll keep 
testing...

...deon