[OpenAFS-devel] problems with current cvs on linux - oops's, possibly related to 1.32-1.34 changes in osi_vnodeops.c

Derek Atkins warlord@MIT.EDU
14 Mar 2002 14:19:56 -0500


In fact, that would exactly explain the single-step that just
happened!  I set two breakpoints, one in
afs_linux_dentry_revalidate() and one at line 821, just after the
call to Check_AtSys.  Your assessment would explain why it jumped
from the head of the function to the free():

Breakpoint 1, afs_linux_dentry_revalidate (dp=0xc20cc1e0, flags=0)
    at ../afs/osi_vnodeops.c:800
800         cred_t *credp = crref();
(gdb) 

Continuing.
Breakpoint 2, osi_FreeLargeSpace (adata=0xc1230000)
    at ../afs/afs_osi_alloc.c:71
71          AFS_STATCNT(osi_FreeLargeSpace);
(gdb)

This patch (suggested by you) does fix the problem.

-derek

Index: src/afs/LINUX/osi_vnodeops.c
===================================================================
RCS file: /cvs/openafs/src/afs/LINUX/osi_vnodeops.c,v
retrieving revision 1.34
diff -u -r1.34 osi_vnodeops.c
--- src/afs/LINUX/osi_vnodeops.c	2002/03/10 19:23:38	1.34
+++ src/afs/LINUX/osi_vnodeops.c	2002/03/14 19:15:53
@@ -807,6 +807,8 @@
 
     AFS_GLOCK();
 
+    sysState.allocked = 0;
+
     /* If it's a negative dentry, then there's nothing to do. */
     if (!vcp || !parentvcp)
         goto done;
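
For anyone following along, here is a minimal sketch of why that one-line
initialization matters.  It is not the OpenAFS code: struct sysname_state,
fill_state(), release_name() and count_releases() are made-up stand-ins for
sysname_info, Check_AtSys() and osi_FreeLargeSpace(), and the stale stack
contents are modeled by a struct that survives between calls so that the
sketch stays well-defined C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for sysname_info: only the fields this sketch needs. */
struct sysname_state {
    const char *name;
    int allocked;            /* nonzero means "name must be released at done:" */
};

static int release_count;    /* how many times the release path ran */

/* Stand-in for osi_FreeLargeSpace(); it only counts, so the sketch stays safe. */
static void release_name(const char *name)
{
    assert(name != NULL);
    release_count++;
}

/* Stand-in for Check_AtSys(): the only place that sets the state. */
static void fill_state(struct sysname_state *st)
{
    st->name = "sysname buffer";
    st->allocked = 1;
}

/*
 * Model of the revalidate routine.  `slot` plays the role of the stack
 * memory that the local sysState occupies; it keeps whatever the previous
 * call left there, which is how stale data shows up in a real local.
 */
static void revalidate(struct sysname_state *slot, int negative, int apply_fix)
{
    if (apply_fix)
        slot->allocked = 0;  /* the one-line fix from the patch */

    if (negative)
        goto done;           /* early exit skips fill_state() */

    fill_state(slot);

done:
    if (slot->allocked)
        release_name(slot->name);
}

/* Run a normal lookup followed by a negative-dentry lookup. */
static int count_releases(int apply_fix)
{
    struct sysname_state slot = { NULL, 0 };

    release_count = 0;
    revalidate(&slot, 0, apply_fix);   /* fills the state and releases once */
    revalidate(&slot, 1, apply_fix);   /* early exit; stale flag unless fixed */
    return release_count;
}
```

In this model count_releases(0) reports two releases of the same buffer
(the double free), while count_releases(1) reports one.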


"Neulinger, Nathan" <nneul@umr.edu> writes:

> Well, adding a 
> 
> sysState.allocked = 0;
> 
> at the top of the revalidate routine makes the oopses stop for me. 
> 
> -- Nathan
> 
> ------------------------------------------------------------
> Nathan Neulinger                       EMail:  nneul@umr.edu
> University of Missouri - Rolla         Phone: (573) 341-4841
> Computing Services                       Fax: (573) 341-4216
> 
> 
> > -----Original Message-----
> > From: Neulinger, Nathan 
> > Sent: Thursday, March 14, 2002 12:59 PM
> > To: openafs-devel@openafs.org
> > Subject: RE: [OpenAFS-devel] problems with current cvs on 
> > linux - oops's, possibly related to 1.32-1.34 changes in 
> > osi_vnodeops.c
> > 
> > 
> > Are local vars guaranteed to be initialized to null?
> > 
> > That revalidate routine has:
> > 
> > 	sysname_info sysState;
> > 
> > if (...)	
> > 	goto done;
> > 
> > ... set sysState in Check_AtSys
> > 
> > done:
> > 	free if sysstate.allocked is 1.
> > 
> > -- Nathan
> > 
> > 
> > > -----Original Message-----
> > > From: Neulinger, Nathan 
> > > Sent: Thursday, March 14, 2002 12:53 PM
> > > To: 'Derek Atkins'
> > > Cc: openafs-devel@openafs.org; chas@cmf.nrl.navy.mil; Ted Anderson
> > > Subject: RE: [OpenAFS-devel] problems with current cvs on 
> > > linux - oops's, possibly related to 1.32-1.34 changes in 
> > > osi_vnodeops.c
> > > 
> > > 
> > > I wonder if the allocked entry of the sysname state isn't 
> > > getting set back to zero somewhere.
> > > 
> > > -- Nathan
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Derek Atkins [mailto:warlord@MIT.EDU] 
> > > > Sent: Thursday, March 14, 2002 12:44 PM
> > > > To: Neulinger, Nathan
> > > > Cc: openafs-devel@openafs.org; chas@cmf.nrl.navy.mil; Ted Anderson
> > > > Subject: Re: [OpenAFS-devel] problems with current cvs on 
> > > > linux - oops's, possibly related to 1.32-1.34 changes in 
> > > > osi_vnodeops.c
> > > > 
> > > > 
> > > > On a closer look there is DEFINITELY a double-free going on.  I
> > > > just caught the following while breakpoints were set in both
> > > > osi_AllocLargeSpace() and osi_FreeLargeSpace():
> > > > 
> > > > Breakpoint 2, osi_FreeLargeSpace (adata=0xc77a9000)
> > > >     at ../afs/afs_osi_alloc.c:71
> > > > 71          AFS_STATCNT(osi_FreeLargeSpace);
> > > > (gdb) 
> > > > Continuing.
> > > > 
> > > > Breakpoint 2, osi_FreeLargeSpace (adata=0xc4170000)
> > > >     at ../afs/afs_osi_alloc.c:71
> > > > 71          AFS_STATCNT(osi_FreeLargeSpace);
> > > > (gdb) 
> > > > Continuing.
> > > > 
> > > > Breakpoint 2, osi_FreeLargeSpace (adata=0xc4170000)
> > > >     at ../afs/afs_osi_alloc.c:71
> > > > 71          AFS_STATCNT(osi_FreeLargeSpace);
> > > > 
> > > > 
> > > > Notice that it's trying to free 0xc4170000 twice?  This last free()
> > > > is coming from:
> > > > 
> > > > (gdb) where
> > > > #0  osi_FreeLargeSpace (adata=0xc4170000) at ../afs/afs_osi_alloc.c:71
> > > > #1  0xc8896a5d in afs_linux_dentry_revalidate (dp=0xc3823260, flags=0)
> > > >     at ../afs/osi_vnodeops.c:847
> > > > #2  0xc01420fd in cached_lookup (parent=0xc662d7c0, name=0xc5349f98, flags=0)
> > > >     at namei.c:249
> > > > 
> > > > Unfortunately I didn't get a backtrace on the penultimate free().
> > > > I'll keep working on it.  But this is definitely the cause of the
> > > > problem -- the same packet is getting onto the freelist twice.
> > > > 
> > > > -derek
> > > > 
> > > > "Neulinger, Nathan" <nneul@umr.edu> writes:
> > > > 
> > > > > Derek was able to track down the reason for the fault with kgdb.
> > > > > I'd guess it likely has something to do with these recent changes
> > > > > to osi_vnodeops.c. I'll take a closer look, but I'm not really
> > > > > familiar with what's going on in the code here.
> > > > > 
> > > > > -- Nathan
> > > > > 
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Neulinger, Nathan 
> > > > > Sent: Thursday, March 14, 2002 12:29 PM
> > > > > To: 'Derek Atkins'
> > > > > Subject: RE: Have you had a succesful build+use with current openafs?
> > > > > 
> > > > > 
> > > > > FYI, looks like it was probably introduced in 1.33 of
> > > > > osi_vnodeops.c. There were a bunch of changes for revalidate_dnode.
> > > > > 
> > > > > -- Nathan
> > > > > 
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Derek Atkins [mailto:warlord@MIT.EDU] 
> > > > > > Sent: Thursday, March 14, 2002 12:23 PM
> > > > > > To: Neulinger, Nathan
> > > > > > Subject: Re: Have you had a succesful build+use with current openafs?
> > > > > > 
> > > > > > 
> > > > > > Yep.  kgdb.sf.net.  Requires two systems with a "serial" line
> > > > > > between them.  In my case I'm using vmware :)
> > > > > > 
> > > > > > We should definitely report this to openafs-devel!  I'll look a
> > > > > > bit more but I've only got about 30 minutes more I can sink into
> > > > > > this today.
> > > > > > 
> > > > > > -derek
> > > > > > 
> > > > > > "Neulinger, Nathan" <nneul@umr.edu> writes:
> > > > > > 
> > > > > > > Goody! :)
> > > > > > > 
> > > > > > > Thanks. That's reassuring; nothing like having a bug like this
> > > > > > > screw with your head when you think that you've made a whole
> > > > > > > bunch of changes that should not have any effect on code behavior.
> > > > > > > 
> > > > > > > Are you using the kernel debugger patches? I definitely should
> > > > > > > dig into that some time.
> > > > > > > 
> > > > > > > -- Nathan
> > > > > > > 
> > > > > > > 
> > > > > > > > -----Original Message-----
> > > > > > > > From: Derek Atkins [mailto:warlord@MIT.EDU] 
> > > > > > > > Sent: Thursday, March 14, 2002 12:17 PM
> > > > > > > > To: Neulinger, Nathan
> > > > > > > > Subject: Re: Have you had a succesful build+use with current openafs?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Here is the stack trace of what's going on.  I have no idea why
> > > > > > > > freePacketList is being set to '1'.  I'll keep looking, but this
> > > > > > > > is clearly not "just you" :)
> > > > > > > > 
> > > > > > > > -derek
> > > > > > > > 
> > > > > > > > Program received signal SIGSEGV, Segmentation fault.
> > > > > > > > 0xc8863031 in osi_AllocLargeSpace (size=720) at ../afs/afs_osi_alloc.c:201
> > > > > > > > 201         if ( tp ) freePacketList = tp->next;
> > > > > > > > (gdb) where
> > > > > > > > #0  0xc8863031 in osi_AllocLargeSpace (size=720) at ../afs/afs_osi_alloc.c:201
> > > > > > > > #1  0xc8872a91 in afs_DoBulkStat (adp=0xc8936658, dirCookie=480,
> > > > > > > >     areqp=0xc5347e68) at ../afs/afs_vnop_lookup.c:426
> > > > > > > > #2  0xc8874c25 in afs_lookup (adp=0xc8936658, aname=0xc13ded80 "CVS",
> > > > > > > >     avcp=0xc5347ec4, acred=0xc412a000) at ../afs/afs_vnop_lookup.c:1190
> > > > > > > > #3  0xc8896c14 in afs_linux_lookup (dip=0xc8936658, dp=0xc13ded20)
> > > > > > > >     at ../afs/osi_vnodeops.c:993
> > > > > > > > #4  0xc0142175 in real_lookup (parent=0xc79210c0, name=0xc5347f4c, flags=0)
> > > > > > > >     at namei.c:284
> > > > > > > > #5  0xc0142936 in path_walk (name=0xc76bf011 "", nd=0xc5347f98) at namei.c:564
> > > > > > > > #6  0xc014313a in __user_walk (name=0xbfffb820 "../../src/doc/CVS", flags=9,
> > > > > > > >     nd=0xc5347f98) at namei.c:805
> > > > > > > > #7  0xc013fcf6 in sys_stat64 (filename=0xbfffb820 "../../src/doc/CVS",
> > > > > > > >     statbuf=0xbfff96b0, flags=1075171668) at stat.c:337
> > > > > > > > #8  0xc0106fcb in system_call () at af_packet.c:1879
> > > > > > > > (gdb) p tp
> > > > > > > > $1 = (struct osi_packet *) 0x1
> > > > > > > > (gdb) p freePacketList
> > > > > > > > $2 = (struct osi_packet *) 0x1
> > > > > > > > 
> > > > > > > > -- 
> > > > > > > >        Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
> > > > > > > >        Member, MIT Student Information Processing Board  (SIPB)
> > > > > > > >        URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
> > > > > > > >        warlord@MIT.EDU                        PGP key available
> > > > > > > > 
> > > > > > 
> > > > > > 
> > > > > _______________________________________________
> > > > > OpenAFS-devel mailing list
> > > > > OpenAFS-devel@openafs.org
> > > > > https://lists.openafs.org/mailman/listinfo/openafs-devel
> > > > 
> > > > 
> > > 

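The effect of the second free seen in the quoted traces can be sketched in
a few lines of C.  This assumes the free list is a simple singly linked
LIFO; struct packet, free_packet(), alloc_packet() and allocations_alias()
are hypothetical names for illustration, not the afs_osi_alloc.c code.
Pushing the same buffer twice makes its next pointer refer to itself, so
two later allocations hand out the same memory:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet type: the free-list link lives inside the buffer. */
struct packet {
    struct packet *next;
    char data[64];
};

static struct packet *free_list;   /* head of the LIFO free list */

/* Push a packet onto the free list. */
static void free_packet(struct packet *p)
{
    p->next = free_list;
    free_list = p;
}

/* Pop a packet from the free list, or return NULL if it is empty. */
static struct packet *alloc_packet(void)
{
    struct packet *p = free_list;

    if (p)
        free_list = p->next;
    return p;
}

/*
 * Free two packets, optionally freeing the second one twice, then
 * allocate twice.  Returns 1 if both allocations got the same packet.
 */
static int allocations_alias(int free_twice)
{
    static struct packet a, b;
    struct packet *first, *second;

    free_list = NULL;
    free_packet(&b);
    free_packet(&a);
    if (free_twice)
        free_packet(&a);   /* a.next now points at a itself */

    first = alloc_packet();
    second = alloc_packet();
    assert(first != NULL && second != NULL);
    return first == second;
}
```

Here allocations_alias(1) returns 1 and allocations_alias(0) returns 0.
Once two callers share a buffer, whatever one of them writes over the link
field becomes the list head on a later allocation, which may be how
freePacketList ended up as 0x1; that last step is a guess, not something
the traces prove.
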
-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord@MIT.EDU                        PGP key available