[OpenAFS-devel] RE: Yuck... largefile support really fouled things up bad...

Nathan Neulinger nneul@umr.edu
23 Mar 2003 08:50:40 -0600


On Sun, 2003-03-23 at 04:52, Hartmut Reuter wrote:
> My feeling is that you should enable large file support for the server 
> only on those platforms where it has been tested already.

I agree, and I think that for a while it should be disabled by default.
But since it should be entirely possible to do the largefile support
natively on Solaris, for example, there is no sense in designing the code
so that it can't handle that situation.

> We could use a similar technique as I introduced for the AFS-client:
> There afs_size_t and afs_offs_t are afs_int64/afs_uint64 if 
> AFS_64BIT_CLIENT is set, otherwise they are afs_int32/afs_uint32.
> 
> So you should replace afs_int64 at most places by say afs_length_t which 
> could be either afs_int64 or afs_int32 depending on whether or not 
> AFS_LARGEFILE_ENV is set.
> 
> The struct afs_int64 is only needed on legacy platforms which neither 
> have native int64 nor have long long. It is needed for the xdr layer 
> which then breaks afs_int64 down to two 32 bit fields. Of course such 
> platforms never will have large file support.

I definitely agree. The point of using the structure (at least
temporarily) is to force you to make certain that the data type changes
are accompanied by the appropriate logic changes. That is, while the
development is taking place, have it be the structure. That forces you
to update your logic and makes certain you can't miss any places where
you might be doing scalar ops against the length/size/etc.

Once that is done, you can just switch it back to the real int64 and the
corresponding macros on platforms that can support it.
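
A rough sketch of what I mean, assuming a build-time switch; the names
LEN64_AUDIT, len64_t, LEN64_GET/LEN64_SET and the helper functions are
made up for illustration and are not what is in the tree:

```c
#include <stdint.h>

/* Hypothetical switch: set to 1 while auditing, 0 once the code is clean. */
#define LEN64_AUDIT 1

#if LEN64_AUDIT
/* One-member struct: expressions like "len + 1" or "len < 0" fail to compile,
 * so every scalar use of a length has to be found and fixed by hand. */
typedef struct { int64_t v; } len64_t;
#define LEN64_GET(x)     ((x).v)
#define LEN64_SET(x, n)  ((x).v = (int64_t)(n))
#else
/* Audit finished: same macros, but the type is a plain 64-bit integer. */
typedef int64_t len64_t;
#define LEN64_GET(x)     (x)
#define LEN64_SET(x, n)  ((x) = (int64_t)(n))
#endif

/* Build a 64-bit length from two 32-bit halves, as an XDR-style layer would. */
static len64_t len64_from_parts(uint32_t hi, uint32_t lo)
{
    len64_t r;
    LEN64_SET(r, ((uint64_t)hi << 32) | lo);
    return r;
}

/* High-order 32 bits of the length. */
static uint32_t len64_hi(len64_t x)
{
    return (uint32_t)((uint64_t)LEN64_GET(x) >> 32);
}

/* Low-order 32 bits of the length. */
static uint32_t len64_lo(len64_t x)
{
    return (uint32_t)((uint64_t)LEN64_GET(x) & 0xffffffffu);
}
```

Code written against the accessor macros compiles unchanged in either
mode, so flipping the switch back costs nothing once the audit is done.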

Related to this, I was thinking that it would be very useful to have an
OpenAFS-specific sprintf/fprintf in the code, so that we could do our
logging/etc. without having to worry about the data types nearly as
much, especially for things like printing inodes/int64's/etc. Extending
it slightly with format specifiers for complex data types:

afs_printf("inode=%inode length=%afs_size_t ip=%ip\n"...)

might enable us to eliminate a lot of crud in the code that goes through
contortions of ifdefs and temporary copies to log certain things.
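
To make the idea concrete, here is a minimal sketch. The function name
sketch_snprintf is hypothetical, it understands only the three
specifiers from the example above, and it assumes the inode is passed as
a uint64_t, the length as an int64_t, and the address as a host-order
32-bit unsigned int:

```c
#include <inttypes.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: handles %inode, %afs_size_t and %ip only.
 * Any other character (including a stray '%') is copied literally.
 * Returns the number of characters written, or -1 if buf is too small. */
static int sketch_snprintf(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    va_start(ap, fmt);
    while (*fmt != '\0') {
        char piece[32];
        size_t n;

        if (strncmp(fmt, "%inode", 6) == 0) {
            /* Caller must pass exactly a uint64_t here. */
            snprintf(piece, sizeof piece, "%" PRIu64, va_arg(ap, uint64_t));
            fmt += 6;
        } else if (strncmp(fmt, "%afs_size_t", 11) == 0) {
            /* Caller must pass exactly an int64_t here. */
            snprintf(piece, sizeof piece, "%" PRId64, va_arg(ap, int64_t));
            fmt += 11;
        } else if (strncmp(fmt, "%ip", 3) == 0) {
            /* IPv4 address in host byte order, printed as a dotted quad. */
            unsigned int a = va_arg(ap, unsigned int);
            snprintf(piece, sizeof piece, "%u.%u.%u.%u",
                     (a >> 24) & 0xffu, (a >> 16) & 0xffu,
                     (a >> 8) & 0xffu, a & 0xffu);
            fmt += 3;
        } else {
            piece[0] = *fmt++;
            piece[1] = '\0';
        }

        n = strlen(piece);
        if (used + n >= len) {
            va_end(ap);
            return -1;
        }
        memcpy(buf + used, piece, n + 1);   /* copies the terminating NUL too */
        used += n;
    }
    va_end(ap);
    return (int)used;
}
```

Word-length specifiers like these collide with the standard ones (%inode
starts with %i), so a real version would need a less ambiguous syntax
and would lose the compiler's printf format checking.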

I agree, we'll probably never support large files on platforms that
don't support real int64's, but I figure, unless we have to, why have
that concern scattered throughout the code more than necessary - have
most of the code operate without being concerned about largefile
support, and only fail or check when it's absolutely required. 
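
As an illustration of checking only where it matters, something like the
following could sit at the boundary of a code path that is still 32-bit
limited. The helper name len64_to_len32 is made up, and EFBIG is the
POSIX errno value, assumed to be available:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: narrow a 64-bit length for a 32-bit-only code path.
 * Returns 0 and stores the value on success, or EFBIG if it does not fit. */
static int len64_to_len32(int64_t len, int32_t *out)
{
    if (len < 0 || len > INT32_MAX)
        return EFBIG;
    *out = (int32_t)len;
    return 0;
}
```

Everything above that boundary carries the 64-bit type around without
caring, and only the one call site that really needs a 32-bit value has
to handle the failure.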

-- Nathan

> Hartmut
> 
> Neulinger, Nathan schrieb:
> > Current situation is that there are a couple severe problems.
> > 
> > The current cvs trunk code is badly broken, independent of the issues
> > related to 1-4 below. The file server is completely unusable on linux
> > right now. With the last patch I sent (and Derrick committed), the
> > volserver is usable. The problem is that the changes for large file
> > support made some fundamental data structure changes, and those changes
> > were not followed through in way too many places, resulting in code that
> > sort of looks like it will work, and maybe does work properly on some
> > platforms, but is completely broken on others. Basically, all over the
> > place, things changed into afs_int64's, and the code was not changed to
> > be aware of the fact that it might not be passing around afs_int32's any
> > more.
> > 
> > I've done some limited testing changing the afs_int64 data types to be a
> > structure instead of a real integer. Based on the HUGE number of compile
> > problems - cascading as well, every fix shows up more - the number of
> > places in the code that this isn't safe for is quite dramatic.
> > 
> > My current recommendation is that the largefile support be backed out,
> > and the following done:
> > 
> > 	Make the afs_int64 type always a structure until the code is
> > stabilized. For that matter, I'm not sure leaving it as a structure will
> > really cause that bad of a performance hit since the structure can be a
> > single element that is the real 64bit int in most cases.
> > 
> > 	Start changing the underlying server code to be largefile safe -
> > i.e. basically make ALL of the sizes/offsets/inodes/etc. be stored in
> > afs_int64/afs_uint64's. This is the vast majority of the work. Note -
> > LEAVE the server as 32 bit limited still, but get all of the code to
> > where it uses the 64bit types for its tracking. Should be able to do
> > this in stages - start with the inodes. 
> > 
> > 	Add back in the code to start dealing with 64bit files to the
> > file server. 
> > 
> > I believe that:
> > 
> > 	a. Any platform should be capable of dealing with the fileserver
> > code that handles 64 bit files, but on platforms not capable of doing 64
> > bit ops - the code that actually depends on it should cause failures.
> > 
> > 	b. Structures for dealing with the large files should be made so
> > that they will be functional with either a real largefile inode, or with
> > namei largefile inodes. 
> > 
> > 	c. Requiring a fsconv_ or similar to convert to a server with
> > large file support should not be a tragedy. AFS was designed to make
> > moving volumes easy, and adding largefile support is such a dramatic new
> > feature that this doesn't seem like a big deal. Easy way out is to clear
> > a server at a time, and just move the volumes. Providing a tool to
> > convert the filesystem on upgrade (which I believe should only be
> > necessary for namei installations) should be the way to go for allowing
> > an in-place upgrade.
> > 
> > 	d. The large file support should probably check the current on
> > disk contents in the case of a namei server for a flag or trigger. If
> > not present and data is found, should error out indicating that the
> > on-disk contents are not largefile aware. 
> > 
> > 	e. It might take some extra code, but I think it should be
> > possible to have the VN_GET_INO() and friends be able to check to see if
> > the current partition is largefile-aware/capable, and process the
> > contents of the vnode accordingly. 
> > 
> > 	f. We have an easy way out - SMALL64VNODEMAGIC
> > LARGE64VNODEMAGIC. If the vnode has the old magic, it is not 64bit
> > aware, and handle the content accordingly. This should make (e) fairly
> > easy to implement. This should also allow a transparent in-place upgrade
> > that will remain mostly backwards compatible. If the write taking place
> > is going to require largefile support, update the vnode to use 64bit
> > magic, otherwise mark it using the 32bit. Doing that will allow you to
> > modify content in-place. 
> > 
> > I will be trying to put together a patch for Derrick to back out the
> > current code to where the trunk fileserver is at least usable. Seems
> > like no matter what we do, it'll be a lot easier to apply stuff a small
> > chunk at a time starting from a working base. Right now, things are in a
> > bad enough state that I would hate to see how long it will take to get
> > us back to a fully working system. 
> > 
> > -- Nathan
> > 
> > ------------------------------------------------------------
> > Nathan Neulinger                       EMail:  nneul@umr.edu
> > University of Missouri - Rolla         Phone: (573) 341-4841
> > Computing Services                       Fax: (573) 341-4216
> > 
> > 
> > 
> >>-----Original Message-----
> >>From: Hartmut Reuter [mailto:reuter@rzg.mpg.de] 
> >>Sent: Thursday, March 20, 2003 3:27 AM
> >>To: R. Lindsay Todd
> >>Cc: Neulinger, Nathan; openafs-devel@openafs.org
> >>Subject: Re: Yuck... largefile support really fouled things up bad...
> >>
> >>
> >>
> >>Sorry for my late reply, but I broke my leg and was in the hospital for
> >>some days.
> >>
> >>The point is:
> >>1.) You can have large file support only with the namei-interface.
> >>2.) the namei-interface needs AFS_64BIT_IOPS_ENV to be set because the
> >>inode number used in namei_ops.c is 64 bit long.
> >>3.) the field vn_ino_hi used to store redundantly the same contents as
> >>the field uniquifier. Therefore I thought it would be better to modify
> >>VNDISK_GET_INO and VNDISK_SET_INO in order to use uniquifier directly
> >>and to use vn_ino_hi for the high order 32 bits of the file length.
> >>4.) You never will be able to change an old fileserver to a large file
> >>supporting fileserver by just installing the new binaries. You will have
> >>to move the volumes to the new server. During the move process the old
> >>contents of vn_ino_hi is not transferred and the field is cleared.
> >>
> >>Hartmut
> >>
> >>
> >>
> >>R. Lindsay Todd schrieb:
> >>
> >>>I can live with this change, and certainly agree that it is better to
> >>>make this change now than later.
> >>>
> >>>/Lindsay
> >>>
> >>>Neulinger, Nathan wrote:
> >>>
> >>>
> >>>>How do y'all feel about maintaining on-disk compatibility when adding
> >>>>the largefile fileserver support to openafs? There are some issues with
> >>>>the current trunk code that make assumptions that are not correct (such
> >>>>as any 64bit_env build MUST HAVE largefile_env defined, or it doesn't
> >>>>work right). Also assumes that we will never be able to do both 64bit
> >>>>iops and large files in the same binary.
> >>>>
> >>>>I'd like to use the reserved6 field in the vnode disk structure to add a
> >>>>length_hi, and start with the attached patch, plus other sanity checking
> >>>>code will need to be added. Since this code hasn't ever been in a real
> >>>>release of openafs, now is the time to decide how to do it.
> >>>>Current implementation forces some dead ends, seems like using the
> >>>>reserved slot is the way to go.
> >>>>Derrick wanted me to talk to both of you first before he started
> >>>>applying these changes. He did apply one I sent that at least gets rid
> >>>>of the failure I talked about on -devel yesterday.
> >>>>Do either of you object to using reserved6 as length_hi, and eliminating
> >>>>the field re-use that is currently in place?
> >>>>
> >>>>-- Nathan
> >>>>
> >>>>------------------------------------------------------------
> >>>>Nathan Neulinger                       EMail:  nneul@umr.edu
> >>>>University of Missouri - Rolla         Phone: (573) 341-4841
> >>>>Computing Services                       Fax: (573) 341-4216
> >>>>
> >>>>
> >>>> 
> >>>>
> >>>>
> >>>>>-----Original Message-----
> >>>>>From: Derrick J Brashear [mailto:shadow@dementia.org]
> >>>>>Sent: Thursday, March 13, 2003 12:54 PM
> >>>>>To: Neulinger, Nathan
> >>>>>Subject: RE: Yuck... largefile support really fouled things up bad...
> >>>>>
> >>>>>On Thu, 13 Mar 2003, Neulinger, Nathan wrote:
> >>>>>
> >>>>>>If largefile_env is defined, it forces namei.
> >>>>>
> >>>>>that doesn't conflict with my statement.
> >>>>>
> >>>>>>largefile_env is currently incompatible with 64bit_iops (it's reusing
> >>>>>>vn_ino_hi, which iops uses), but the only reason it doesn't clash/fail
> >>>>>>is that it is forcing namei.
> >>>>>
> >>>>>ok, so talk to lindsay or hartmut and see if they care if we switch.
> >>
> >>-- 
> >>-----------------------------------------------------------------
> >>Hartmut Reuter                           e-mail reuter@rzg.mpg.de
> >>					   phone +49-89-3299-1328
> >>RZG (Rechenzentrum Garching)               fax   +49-89-3299-1301
> >>Computing Center of the Max-Planck-Gesellschaft (MPG) and the
> >>Institut fuer Plasmaphysik (IPP)
> >>-----------------------------------------------------------------
> >>
> >>
> > 
> > _______________________________________________
> > OpenAFS-devel mailing list
> > OpenAFS-devel@openafs.org
> > https://lists.openafs.org/mailman/listinfo/openafs-devel
-- 

------------------------------------------------------------
Nathan Neulinger                       EMail:  nneul@umr.edu
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216