[OpenAFS-devel] error report with hardly any useful information, an assert in file server code

Nathan Neulinger nneul@umr.edu
12 Mar 2003 19:24:04 -0600


Ok, this wasn't it... Looks like gcc is generating bogus code for long
long comparisons. I was misreading stds.h.

Is anyone else seeing this with a current cvs build? Build has been
working fine with gcc-3.2.1 until this. 
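
One way to narrow this down might be a small standalone check that does the
same comparison on 64-bit values and on narrowed copies. The sketch below is
only an illustration: offs64_t is a hypothetical stand-in for the 64-bit
offset type (not the OpenAFS definition), and the 2048 and 8192 values are
taken from the earlier message quoted below.

```c
#include <assert.h>

/* Hypothetical stand-in for the 64-bit offset type; NOT the OpenAFS typedef. */
typedef long long offs64_t;

/*
 * Returns 1 if "size > nbytes" gives the same answer for 64-bit operands
 * as for the same values copied into unsigned long temporaries, else 0.
 * Only meaningful for small non-negative values that fit in both types.
 */
static int compare_agrees(offs64_t size, offs64_t nbytes)
{
    /* volatile keeps the compiler from folding the compare at build time */
    volatile offs64_t s64 = size;
    volatile offs64_t n64 = nbytes;
    volatile unsigned long s32 = (unsigned long)size;
    volatile unsigned long n32 = (unsigned long)nbytes;

    int wide = (s64 > n64);
    int narrow = (s32 > n32);
    return wide == narrow;
}

/* Runs the check with the values reported in the thread, in both orders. */
static void self_check(void)
{
    assert(compare_agrees(8192, 2048));
    assert(compare_agrees(2048, 8192));
}
```

If self_check() aborts when built with the same compiler and flags as the
volserver, that would point at code generation rather than at the AFS sources.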

-- Nathan

On Wed, 2003-03-12 at 19:04, Nathan Neulinger wrote:
> Don't know if this is a gcc problem, or what, but it looks like
> comparisons between afs_offs_t's are not working properly:
> 
> If I copy the contents of nbytes and size to temporary integers, the
> comparisons work just fine in the read loop. But it looks like the
> results are reversed, 
> 
> when nbytes=2048 and size=8192, the generated code thinks that
> size<nbytes. But copying size and nbytes to temporary unsigned long or
> int vars it thinks that size > nbytes. 
> 
> What's odd is that an afs_offs_t is supposed to be an afs_uint32, which
> should be an unsigned long. 
> 
> Now here is where it gets really weird. If I copy nbytes and size to
> afs_uint32 temporary vars, the comparison works fine as well... 
> 
> AHHH! I see now. afs_offs_t is a non-atomic (not the right word) data
> type when on a client that supports 64 bit. 
> 
> The code should be changed to use the 64bit compare functions if
> AFS_64BIT_CLIENT is defined. I'll try to come up with a patch.
> 
> The current cvs head code should probably be considered unsafe until all
> of the recent largefile changes have been checked over for this sort of
> problem. 
> 
> -- Nathan
> 
> 
> On Wed, 2003-03-12 at 17:59, Nathan Neulinger wrote:
> > Looks like there is something going wacko with the Log() call; those
> > numbers aren't real, at least not the 1362... one. Printing out those
> > numbers individually appears to work ok. Might be an integer width issue?
> > 
> > In any case, looks like in the empty-volume case, iod_Read is returning
> > 2053 bytes when a read of 2048 bytes was requested. Still digging...
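
If the width guess is right, the usual cause is a 64-bit argument passed to a
varargs call whose format expects an int, which also shifts every argument
after it. The actual Log() format string is not shown in this thread, so the
sketch below is an assumption: offs64_t and format_sizes() are hypothetical
names, and the point is only the explicit cast paired with %lld.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the 64-bit offset type; NOT the OpenAFS typedef. */
typedef long long offs64_t;

/*
 * Formats two 64-bit sizes. Each argument is cast to long long so that it
 * matches the %lld conversion exactly; pairing a 64-bit value with %d is
 * undefined behavior and can garble the arguments that follow.
 */
static int format_sizes(char *buf, size_t len, offs64_t size, offs64_t nbytes)
{
    return snprintf(buf, len, "size=%lld nbytes=%lld",
                    (long long)size, (long long)nbytes);
}

/* Checks the formatted text for a 2048-byte request fully satisfied. */
static void self_check(void)
{
    char buf[64];

    format_sizes(buf, sizeof(buf), 2048, 2048);
    assert(strcmp(buf, "size=2048 nbytes=2048") == 0);
}
```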
> > 
> > On Wed, 2003-03-12 at 16:26, Neulinger, Nathan wrote:
> > > Looks to me like the nbytes value is way out of whack. I get this on the
> > > empty volume:
> > > 
> > > Mar 12 16:21:34 afs-fs18 volserver[13119]: 1 Volser: CreateVolume:
> > > volume 538199060 (users.nneul.test1) created 
> > > Mar 12 16:21:34 afs-fs18 volserver[13119]: 1 Volser: WriteFile: Error
> > > reading dump file 1 size=2048 nbytes=136295336 (2048 of 136295336);
> > > restore aborted 
> > > Mar 12 16:21:34 afs-fs18 volserver[13119]: 1 Volser: ReadVnodes: IDEC
> > > inode 1 
> > > 
> > > -- Nathan
> > > 
> > > ------------------------------------------------------------
> > > Nathan Neulinger                       EMail:  nneul@umr.edu
> > > University of Missouri - Rolla         Phone: (573) 341-4841
> > > Computing Services                       Fax: (573) 341-4216
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Neulinger, Nathan 
> > > > Sent: Wednesday, March 12, 2003 4:22 PM
> > > > To: openafs-devel@openafs.org
> > > > Subject: RE: [OpenAFS-devel] error report with hardly any useful information, an assert in file server code
> > > > 
> > > > 
> > > > I have a reproducible error condition with the current cvs volserver
> > > > code. I can move sw.krb5src between my other file servers (all running
> > > > 2002-09-18 build) without any problem, but as soon as I try to move the
> > > > volume to the file server running current code, it fails:
> > > > 
> > > > Mar 12 16:11:19 afs-fs18 volserver[13119]: admin is executing
> > > > CreateVolume 'sw.krb5src' 
> > > > Mar 12 16:11:19 afs-fs18 volserver[13119]: 1 Volser: CreateVolume:
> > > > volume 536928507 (sw.krb5src) created 
> > > > Mar 12 16:12:12 afs-fs18 volserver[13119]: 1 Volser: WriteFile: Error
> > > > reading dump file 1 size=2048 nbytes=136294776 (-482088960 of
> > > > 136294775); restore aborted 
> > > > Mar 12 16:12:12 afs-fs18 volserver[13119]: 1 Volser: ReadVnodes: IDEC
> > > > inode 1 
> > > > 
> > > > Note - moving volumes OFF of that server works fine. 
> > > > 
> > > > I'm going to try and get some more info, but since this is very
> > > > reproducible, I figured it would be a good thing to investigate.
> > > > 
> > > > Note - it happens with ANY volume size, including a freshly created
> > > > volume that is completely empty.
> > > > 
> > > > -- Nathan
> > > > 
> > > > ------------------------------------------------------------
> > > > Nathan Neulinger                       EMail:  nneul@umr.edu
> > > > University of Missouri - Rolla         Phone: (573) 341-4841
> > > > Computing Services                       Fax: (573) 341-4216
> > > > 
> > > > 
> > > > > -----Original Message-----
> > > > > From: Neulinger, Nathan 
> > > > > Sent: Wednesday, March 12, 2003 3:31 PM
> > > > > To: openafs-devel@openafs.org
> > > > > Subject: [OpenAFS-devel] error report with hardly any useful information, an assert in file server code
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Just upgraded a server to current snapshot, and got this assert. I
> > > > > believe it was in the process of trying to do a dump or a clone. I'm not
> > > > > positive.
> > > > > 
> > > > > Assertion failed! file /umr/s/openafs/openafs/src/viced/afsfileprocs.c,
> > > > > line 6725.
> > > > > 
> > > > > That looks to be something to do with large file support. 
> > > > > 
> > > > > I also got a couple of these, but have not been able to reproduce.
> > > > > 
> > > > > Mar 12 15:18:06 afs-fs8 volserver[20341]: 1 Volser: Clone: Cloning
> > > > > volume 537102738 to new volume 538198043 
> > > > > Mar 12 15:18:11 afs-fs8 volserver[20341]: 1 Volser: WriteFile: Error
> > > > > reading dump file 1 size=354304 nbytes=136179608 (-702464 of 136179607);
> > > > > restore aborted 
> > > > > Mar 12 15:18:11 afs-fs8 volserver[20341]: 1 Volser: ReadVnodes: IDEC
> > > > > inode 67108865
> > > > > 
> > > > > I need to check and make sure, but it looks like I'm seeing a bunch of
> > > > > warnings about idle transactions that I wasn't seeing before. They are
> > > > > all against clone volumes that were being used for dumps.
> > > > > 
> > > > > Just got another one of those assertions. 
> > > > > 
> > > > > Basically, current cvs file/volserver code appears to have some issues.
> > > > > I'm going to try getting some more details on another server with some
> > > > > test volumes... 
> > > > > 
> > > > > -- Nathan
> > > > > 
> > > > > ------------------------------------------------------------
> > > > > Nathan Neulinger                       EMail:  nneul@umr.edu
> > > > > University of Missouri - Rolla         Phone: (573) 341-4841
> > > > > Computing Services                       Fax: (573) 341-4216
> > > > > _______________________________________________
> > > > > OpenAFS-devel mailing list
> > > > > OpenAFS-devel@openafs.org
> > > > > https://lists.openafs.org/mailman/listinfo/openafs-devel
> > > > > 
-- 

------------------------------------------------------------
Nathan Neulinger                       EMail:  nneul@umr.edu
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216