[OpenAFS-devel] OpenAFS 1.4.0 rc3 crashes on Linux 2.6
Andy Lutomirski
amluto@hotmail.com
Fri, 21 Oct 2005 01:09:37 +0000
>From: "chas williams - CONTRACTOR" <chas@cmf.nrl.navy.mil>
>To: "Andy Lutomirski" <amluto@hotmail.com>
>CC: openafs-devel@openafs.org
>Subject: Re: [OpenAFS-devel] OpenAFS 1.4.0 rc3 crashes on Linux 2.6
>Date: Wed, 12 Oct 2005 22:18:24 -0400
>
>In message <BAY106-F30200F494E413413103CB3C1870@phx.gbl>,"Andy Lutomirski"
>writes:
> >I frequently get the following crash. I can trigger it most of the time by
> >running 'ls /afs/ir' (which is a symlink to /afs/ir.stanford.edu) after not
> >using afs for some time.
>
>what afs options are you using? -dynroot? -fakestat? approximately how
>long is some time?
>
> >ls D ffff810019d0b440 0 3410 3361 3413 (NOTLB)
> >ffff810011eb7e38 0000000000000082 00000000005206a8 ffff81000db23d68
> > ffff810008856e50 ffff8100088560b0 ffff810008856e50 ffff8100088562c8
> > 0000000000000000 ffffffff881992b0
> >Call Trace:<ffffffff881992b0>{:libafs:afs_linux_getattr+288}
> > <ffffffff80389db6>{__down+198}
> > <ffffffff8012d5a0>{default_wake_function+0}
> > <ffffffff8038b9e4>{__down_failed+53}
> > <ffffffff8018a250>{filldir+0}
> > <ffffffff8018a5e9>{.text.lock.readdir+5}
> > <ffffffff8018a3c2>{sys_getdents+130}
> > <ffffffff8010f389>{error_exit+0}
> > <ffffffff8010ea96>{system_call+126}
> >
> >Any ideas? Any tests I can run to help debug this?
>
>you dont mention the machine platform, but i am going to guess x86_64?
>it would be helpful if you could do something like:
>
>gdb /wherever/the/afs/module/is.ko
>(gdb) info line *afs_linux_getattr+0x288
>
>and send along the output. thanks.
Yes, this is x86_64.
I just re-triggered it with 1.4.0-rc6 (both userspace and kernel, although
the rc3 userspace part had been running with the rc6 libafs.ko for a while
without an intervening reboot).
Here's the OOPS:
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "/var/tmp/portage/openafs-kernel-1.4.0_rc6/work/op:131
invalid operand: 0000 [1] PREEMPT
CPU 0
Modules linked in: libafs ipt_conntrack iptable_nat ipt_REJECT ipt_state
ip_conntrack ipt_multiport iptable_filter ip_tables snd_pcm_oss
snd_mixer_oss snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
snd_cs4281 snd_opl3_lib snd_hwdep snd_via82xx snd_ac97_codec snd_pcm
snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd
soundcore xfs raid5 xor raid0
Pid: 1007, comm: ls Tainted: P 2.6.13-gentoo-r3
RIP: 0010:[<ffffffff8818a250>] <ffffffff8818a250>{:libafs:osi_Panic+0}
RSP: 0000:ffff810008575da0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: fffffffffffffffb RCX: 0000000000000000
RDX: ffff81001871af10 RSI: 00000000000a49c6 RDI: ffffffff881aacda
RBP: ffff81001f05ac00 R08: 0000000000000000 R09: ffff810008402d80
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100148c7000
R13: ffffffff881c30c0 R14: 00000000000a49c6 R15: 00000000000a49c6
FS: 00002aaaaaadbb00(0000) GS:ffffffff80502800(0000) knlGS:0000000040105940
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000051db78 CR3: 00000000047d7000 CR4: 00000000000006e0
Process ls (pid: 1007, threadinfo ffff810008574000, task ffff810010ca20b0)
Stack: ffffffff881953a8 0000000000000000 0000000000000000 ffff81001ba833f0
0000000000000000 ffffc20000797d08 000000000051cb40 000000000051cb20
ffffffff88145690 0000000000000001
Call Trace:<ffffffff881953a8>{:libafs:osi_UFSOpen+440}
<ffffffff88145690>{:libafs:DRead+848}
<ffffffff881535bd>{:libafs:afs_dir_GetBlob+13}
<ffffffff8817043f>{:libafs:BlobScan+31}
<ffffffff881988c1>{:libafs:afs_linux_readdir+1041}
<ffffffff8011f869>{do_page_fault+1113}
<ffffffff8018a250>{filldir+0} <ffffffff8018a0e6>{vfs_readdir+118}
<ffffffff8018a3c2>{sys_getdents+130} <ffffffff8010f389>{error_exit+0}
<ffffffff8010ea96>{system_call+126}
Code: 0f 0b a3 68 d6 1a 88 ff ff ff ff c2 83 00 c3 90 48 83 fe 01
RIP <ffffffff8818a250>{:libafs:osi_Panic+0} RSP <ffff810008575da0>
It looks like I don't have symbols for libafs.ko. Grr. I've rebuilt with
symbols.
Take these gdb outputs with a grain of salt, since they're from the wrong
binary. Hopefully code generation is deterministic enough...
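For anyone repeating these lookups, here is a rough sketch in C of how a module frame from the trace maps to the gdb command. It assumes the offsets printed in the oops are decimal, which is consistent with the <afs_linux_readdir+1041> form that gdb itself prints. The helper name frame_to_gdb is made up for illustration and is not part of OpenAFS or gdb.

```c
#include <stdio.h>

/*
 * Hypothetical helper: convert one module frame from the oops, e.g.
 *   <ffffffff881988c1>{:libafs:afs_linux_readdir+1041}
 * into the matching gdb command.  Assumes the offset is decimal.
 * Returns 0 on success, -1 if the frame is not in {:module:symbol+offset} form
 * (core kernel frames such as {filldir+0} are rejected).
 */
int frame_to_gdb(const char *frame, char *out, size_t outlen)
{
    char module[64];
    char symbol[128];
    unsigned long offset;

    /* Skip leading whitespace and the <address>, then pick out the
     * module name, the symbol name, and the decimal offset. */
    if (sscanf(frame, " <%*[0-9a-f]>{:%63[^:]:%127[^+]+%lu",
               module, symbol, &offset) != 3)
        return -1;

    snprintf(out, outlen, "info line *%s+%lu", symbol, offset);
    return 0;
}
```

Feeding it the afs_linux_readdir frame from the trace above gives "info line *afs_linux_readdir+1041", which is the lookup whose output follows.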
Line 230 of
"/var/tmp/portage/openafs-kernel-1.4.0_rc6/work/openafs-1.4.0-rc6/src/libafs/MODLOAD-2.6.13-gentoo-r3-SP/osi_vnodeops.c"
starts at address 0x558c1 <afs_linux_readdir+1041> and ends at 0x558c3
<afs_linux_readdir+1043>.
225      */
226     code = 0;
227     offset = (int) fp->f_pos;
228     while (1) {
229         dirpos = BlobScan(tdc, offset);
230         if (!dirpos)
231             break;
232
233         de = afs_dir_GetBlob(tdc, dirpos);
234         if (!de)
Line 82 of
"/var/tmp/portage/openafs-kernel-1.4.0_rc6/work/openafs-1.4.0-rc6/src/libafs/MODLOAD-2.6.13-gentoo-r3-SP/afs_vnop_readdir.c"
starts at address 0x2d43f <BlobScan+31> and ends at 0x2d442 <BlobScan+34>.
77      AFS_STATCNT(BlobScan);
78      /* advance ablob over free and header blobs */
79      while (1) {
80          pageBlob = ablob & ~(EPP - 1);  /* base blob in same page */
81          tpe = (struct PageHeader *)afs_dir_GetBlob(afile, pageBlob);
82          if (!tpe)
83              return 0;  /* we've past the end */
84          relativeBlob = ablob - pageBlob;  /* relative to page's first blob */
85          /* first watch for headers */
86          if (pageBlob == 0) {  /* first dir page has extra-big header */
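For readers following the BlobScan listing, the masking on line 80 rounds a blob index down to the first blob of its directory page. A minimal sketch of that arithmetic, assuming EPP (entries per page) is 64; the trick only works when EPP is a power of two. The function names here are illustrative and do not appear in the OpenAFS source.

```c
/* Assumption: 64 directory entries (blobs) per page; must be a power of two. */
#define EPP 64

/* Index of the first blob in the page that contains ablob. */
int page_base_blob(int ablob)
{
    return ablob & ~(EPP - 1);
}

/* Position of ablob relative to the first blob of its page. */
int relative_blob(int ablob)
{
    return ablob - page_base_blob(ablob);
}
```

With these assumptions, blob 130 lives in the page starting at blob 128 and sits at relative position 2.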
******* This one is even less likely to be correct, since it's an rc3 oops
looked up with rc6 symbols. Sorry.
(gdb) info line *afs_linux_getattr+288
Line 672 of
"/var/tmp/portage/openafs-kernel-1.4.0_rc6/work/openafs-1.4.0-rc6/src/libafs/MODLOAD-2.6.13-gentoo-r3-SP/osi_vnodeops.c"
starts at address 0x565d0 <afs_linux_getattr+288> and ends at 0x56610
<afs_linux_dentry_revalidate>.
(gdb) list *afs_linux_getattr+288
0x565d0 is in afs_linux_getattr
(/var/tmp/portage/openafs-kernel-1.4.0_rc6/work/openafs-1.4.0-rc6/src/libafs/MODLOAD-2.6.13-gentoo-r3-SP/osi_vnodeops.c:672).
667         int err = afs_linux_revalidate(dentry);
668         if (!err) {
669                 generic_fillattr(dentry->d_inode, stat);
670         }
671         return err;
672 }
673 #endif
674
675 /* Validate a dentry. Return 1 if unchanged, 0 if VFS layer should re-evaluate.
676  * In kernels 2.2.10 and above, we are passed an additional flags var which
I'll reboot and run with the new libafs with real symbols and email again if
this triggers.
Thanks,
Andy