[OpenAFS-devel] RE: Yuck... largefile support really fouled things up bad...

Hartmut Reuter reuter@rzg.mpg.de
Sun, 23 Mar 2003 11:52:28 +0100


My feeling is that you should enable large file support for the server 
only on those platforms where it has been tested already.

We could use a technique similar to the one I introduced for the AFS client:
there, afs_size_t and afs_offs_t are afs_int64/afs_uint64 if
AFS_64BIT_CLIENT is set; otherwise they are afs_int32/afs_uint32.

So you should replace afs_int64 in most places with, say, afs_length_t, which
could be either afs_int64 or afs_int32 depending on whether or not
AFS_LARGEFILE_ENV is set.
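
A minimal sketch of that idea, not the actual OpenAFS headers: the
afs_int32/afs_int64 typedefs below are stand-ins built on <stdint.h>,
afs_length_t is only the name suggested above, and afs_length_max() is a
hypothetical helper added for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-ins for the AFS fixed-width types (assumption: the real ones
 * come from the AFS configuration headers). */
typedef int32_t afs_int32;
typedef int64_t afs_int64;

/* Proposed length type: 64 bits wide only when large file support
 * has been enabled for this platform. */
#ifdef AFS_LARGEFILE_ENV
typedef afs_int64 afs_length_t;
#else
typedef afs_int32 afs_length_t;
#endif

/* Hypothetical helper: largest file length this build can represent. */
static afs_int64 afs_length_max(void)
{
    return (sizeof(afs_length_t) == 8) ? INT64_MAX : INT32_MAX;
}
```

Code that stores lengths in afs_length_t then compiles unchanged on both
kinds of platform; only the typedef switches.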

The struct afs_int64 is only needed on legacy platforms which have neither
a native int64 nor long long. It is needed for the xdr layer,
which then breaks afs_int64 down into two 32-bit fields. Of course such
platforms will never have large file support.
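
To illustrate the two-field breakdown, here is a rough sketch. The names
sketch_int64, split64 and join64 are made up for this example and are not
the OpenAFS definitions; it uses uint64_t only to show how the two halves
correspond to a native value on a platform that has one.

```c
#include <stdint.h>

/* Hypothetical two-field representation of a 64-bit quantity. */
struct sketch_int64 {
    uint32_t high;  /* upper 32 bits */
    uint32_t low;   /* lower 32 bits */
};

/* Break a native 64-bit value into its two 32-bit fields. */
static struct sketch_int64 split64(uint64_t v)
{
    struct sketch_int64 s;
    s.high = (uint32_t)(v >> 32);
    s.low = (uint32_t)(v & 0xFFFFFFFFu);
    return s;
}

/* Rebuild the native 64-bit value from the two fields. */
static uint64_t join64(struct sketch_int64 s)
{
    return ((uint64_t)s.high << 32) | (uint64_t)s.low;
}
```

A length of 4 GB + 5 bytes (0x100000005) would travel as high = 1 and
low = 5 under this layout.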

Hartmut

Neulinger, Nathan wrote:
> Current situation is that there are a couple severe problems.
> 
> The current cvs trunk code is badly broken, independent of the issues
> related to 1-4 below. The file server is completely unusable on linux
> right now. With the last patch I sent (and Derrick committed), the
> volserver is usable. The problem is that the changes for large file
> support made some fundamental data structure changes, and those changes
> were not followed through in way too many places, resulting in code that
> sort of looks like it will work, and maybe does work properly on some
> platforms, but is completely broken on others. Basically, all over the
> place, things changed into afs_int64's, and the code was not changed to
> be aware of the fact that it might not be passing around afs_int32's any
> more.
> 
> I've done some limited testing changing the afs_int64 data types to be a
> structure instead of a real integer. Based on the HUGE number of compile
> problems - cascading as well, every fix shows up more - the number of
> places in the code that this isn't safe for is quite dramatic.
> 
> My current recommendation is that the largefile support be backed out,
> and the following done:
> 
> 	Make the afs_int64 type always a structure until the code is
> stabilized. For that matter, I'm not sure leaving it as a structure will
> really cause that bad of a performance hit since the structure can be a
> single element that is the real 64bit int in most cases.
> 
> 	Start changing the underlying server code to be largefile safe -
> i.e. basically make ALL of the sizes/offsets/inodes/etc. be stored in
> afs_int64/afs_uint64's. This is the vast majority of the work. Note -
> LEAVE the server as 32 bit limited still, but get all of the code to
> where it uses the 64bit types for its tracking. Should be able to do
> this in stages - start with the inodes. 
> 
> 	Add back in the code to start dealing with 64bit files to the
> file server. 
> 
> I believe that:
> 
> 	a. Any platform should be capable of dealing with the fileserver
> code that handles 64 bit files, but on platforms not capable of doing 64
> bit ops - the code that actually depends on it should cause failures.
> 
> 	b. Structures for dealing with the large files should be made so
> that they will be functional with either a real largefile inode, or with
> namei largefile inodes. 
> 
> 	c. Requiring a fsconv_ or similar to convert to a server with
> large file support should not be a tragedy. AFS was designed to make
> moving volumes easy, and adding largefile support is such a dramatic new
> feature that this doesn't seem like a big deal. Easy way out is to clear
> a server at a time, and just move the volumes. Providing a tool to
> convert the filesystem on upgrade (which I believe should only be
> necessary for namei installations) should be the way to go for allowing
> an in-place upgrade.
> 
> 	d. The large file support should probably check the current on
> disk contents in the case of a namei server for a flag or trigger. If
> not present and data is found, should error out indicating that the
> on-disk contents are not largefile aware. 
> 
> 	e. It might take some extra code, but I think it should be
> possible to have the VN_GET_INO() and friends be able to check to see if
> the current partition is largefile-aware/capable, and process the
> contents of the vnode accordingly. 
> 
> 	f. We have an easy way out - SMALL64VNODEMAGIC
> and LARGE64VNODEMAGIC. If the vnode has the old magic, it is not 64bit
> aware, so handle the content accordingly. This should make (e) fairly
> easy to implement. This should also allow a transparent in-place upgrade
> that will remain mostly backwards compatible. If the write taking place
> is going to require largefile support, update the vnode to use 64bit
> magic, otherwise mark it using the 32bit. Doing that will allow you to
> modify content in-place. 
> 
> I will be trying to put together a patch for Derrick to back out the
> current code to where the trunk fileserver is at least usable. Seems
> like no matter what we do, it'll be a lot easier to apply stuff a small
> chunk at a time starting from a working base. Right now, things are in a
> bad enough state that I would hate to see how long it will take to get
> us back to a fully working system. 
> 
> -- Nathan
> 
> ------------------------------------------------------------
> Nathan Neulinger                       EMail:  nneul@umr.edu
> University of Missouri - Rolla         Phone: (573) 341-4841
> Computing Services                       Fax: (573) 341-4216
> 
> 
> 
>>-----Original Message-----
>>From: Hartmut Reuter [mailto:reuter@rzg.mpg.de] 
>>Sent: Thursday, March 20, 2003 3:27 AM
>>To: R. Lindsay Todd
>>Cc: Neulinger, Nathan; openafs-devel@openafs.org
>>Subject: Re: Yuck... largefile support really fouled things up bad...
>>
>>
>>
>>Sorry for my late reply, but I broke my leg and was in the hospital
>>for some days.
>>
>>The point is:
>>1.) You can have large file support only with the namei interface.
>>2.) The namei interface needs AFS_64BIT_IOPS_ENV to be set because the
>>inode number used in namei_ops.c is 64 bits long.
>>3.) The field vn_ino_hi used to redundantly store the same contents as
>>the field uniquifier. Therefore I thought it would be better to modify
>>VNDISK_GET_INO and VNDISK_SET_INO to use uniquifier directly
>>and to use vn_ino_hi for the high-order 32 bits of the file length.
>>4.) You will never be able to change an old fileserver into a
>>large-file-supporting fileserver by just installing the new binaries.
>>You will have to move the volumes to the new server. During the move
>>the old contents of vn_ino_hi are not transferred and the field is
>>cleared.
>>
>>Hartmut
>>
>>
>>
>>R. Lindsay Todd wrote:
>>
>>>I can live with this change, and certainly agree that it is better to
>>>make this change now than later.
>>>
>>>/Lindsay
>>>
>>>Neulinger, Nathan wrote:
>>>
>>>
>>>>How do y'all feel about maintaining on-disk compatibility when adding
>>>>the largefile fileserver support to openafs? There are some issues with
>>>>the current trunk code that make assumptions that are not correct (such
>>>>as any 64bit_env build MUST HAVE largefile_env defined, or it doesn't
>>>>work right). Also assumes that we will never be able to do both 64bit
>>>>iops and large files in the same binary.
>>>>
>>>>I'd like to use the reserved6 field in the vnode disk structure to add a
>>>>length_hi, and start with the attached patch, plus other sanity checking
>>>>code that will need to be added. Since this code hasn't ever been in a
>>>>real release of openafs, now is the time to decide how to do it.
>>>>Current implementation forces some dead ends, seems like using the
>>>>reserved slot is the way to go.
>>>>Derrick wanted me to talk to both of you first before he started
>>>>applying these changes. He did apply one I sent that at least gets rid
>>>>of the failure I talked about on -devel yesterday.
>>>>Do either of you object to using reserved6 as length_hi, and eliminating
>>>>the field re-use that is currently in place?
>>>>
>>>>-- Nathan
>>>>
>>>>------------------------------------------------------------
>>>>Nathan Neulinger                       EMail:  nneul@umr.edu
>>>>University of Missouri - Rolla         Phone: (573) 341-4841
>>>>Computing Services                       Fax: (573) 341-4216
>>>>
>>>>
>>>> 
>>>>
>>>>
>>>>>-----Original Message-----
>>>>>From: Derrick J Brashear [mailto:shadow@dementia.org]
>>>>>Sent: Thursday, March 13, 2003 12:54 PM
>>>>>To: Neulinger, Nathan
>>>>>Subject: RE: Yuck... largefile support really fouled things up bad...
>>>>>
>>>>>On Thu, 13 Mar 2003, Neulinger, Nathan wrote:
>>>>>
>>>>>>If largefile_env is defined, it forces namei.
>>>>>
>>>>>that doesn't conflict with my statement.
>>>>>
>>>>>>largefile_env is currently incompatible with 64bit_iops (it's reusing
>>>>>>vn_ino_hi, which iops uses), but the only reason it doesn't clash/fail
>>>>>>is that it is forcing namei.
>>>>>
>>>>>ok, so talk to lindsay or hartmut and see if they care if we switch.
>>>
>>
>>-- 
>>-----------------------------------------------------------------
>>Hartmut Reuter                           e-mail reuter@rzg.mpg.de
>>					   phone +49-89-3299-1328
>>RZG (Rechenzentrum Garching)               fax   +49-89-3299-1301
>>Computing Center of the Max-Planck-Gesellschaft (MPG) and the
>>Institut fuer Plasmaphysik (IPP)
>>-----------------------------------------------------------------
>>
>>
> 


-- 
-----------------------------------------------------------------
Hartmut Reuter                           e-mail reuter@rzg.mpg.de
					   phone +49-89-3299-1328
RZG (Rechenzentrum Garching)               fax   +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------