[OpenAFS-devel] Re: [OpenAFS] 2.6 kernel support anytime soon? Workarounds?

Jeffrey Hutzelman jhutz@cmu.edu
Tue, 11 May 2004 20:40:24 -0400


On Tuesday, May 11, 2004 19:07:42 -0400 Garance A Drosihn <drosih@rpi.edu> 
wrote:

> At 1:19 PM -0500 5/11/04, Neulinger, Nathan wrote:
>> I think Jeff's numbers were a bit too wide... I believe that if
>> you reserve over >=32512 you should be fine.
>>
>> i.e. 7F00 - BF00 would be the range you want to avoid...
>>
>> -- Nathan
>>
>> ------------------------------------------------------------
>> Nathan Neulinger                       EMail:  nneul@umr.edu
>> University of Missouri - Rolla         Phone: (573) 341-6679
>> UMR Information Technology             Fax: (573) 341-4216
>>
>>
>>> -----Original Message-----
>>> From: openafs-devel-admin@openafs.org
>>> [mailto:openafs-devel-admin@openafs.org] On Behalf Of Garrett Wollman
>>> Sent: Tuesday, May 11, 2004 1:13 PM
>>> To: Jeffrey Hutzelman
>>>
>>> This could bite us very hard.  Our users are set up with primary
>>> gid == uid, and the shared UID space runs from 5000 to 32000.
>>> -----End Message-----
>
> If I am reading the include-files right, in the land of freebsd
> we have gid's that are 32-bits wide.  aka __uint32_t
>
> Could we just move the range used by AFS for PAG-values?

Yes; it would be fairly easy to change the mapping to use any contiguous 
block of 0x4000 group ID's.  Using larger-than-16-bit ID's might be a bit 
trickier, but not much.  I'm sure the gatekeepers would be happy to take a 
patch, and I'll even try to point folks in the right direction...


The functions you care about are in src/afs/afs_osi_pag.c; particularly, 
afs_get_pag_from_groups() and afs_get_groups_from_pag(), which do the 
translation in both directions.

I explained how the algorithm works in a previous message.  Ultimately, we 
compute two values, g0 and g1, which will be used to set the contents of 
the first two slots in the supplementary groups list.  Each of these values 
is constrained to be in the range 0x4000..0x7FFF.  This is because of the 
method used to compute the high 2 bits of each value, and because the PAG 
number is constrained to be in the range 0x41000000-0x41FFFFFF (see the PAG 
assignment algorithm in genpag()).
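
For anyone who wants to see the shape of the translation without opening
the source, here is a rough sketch of the two directions, before any
offset is applied.  It is written from the description above, not copied
from afs_osi_pag.c: the names pag_to_raw() and raw_to_pag() are made up
for illustration, and the divide-by-3 split of the top four bits is my
reading of how the high 2 bits get computed, so check it against the
real code before relying on it.

```c
#include <assert.h>
#include <stdint.h>

#define PAG_HIGH_BYTE 0x41u   /* PAGs lie in 0x41000000-0x41FFFFFF */

/* Illustrative only: split a PAG into two values g0 and g1.
 * Each gets 14 bits of the PAG plus 2 high bits derived from the
 * PAG's top 4 bits.  For a valid PAG the top 4 bits are always 4,
 * so both results land in 0x4000..0x7FFF. */
static void pag_to_raw(uint32_t pag, uint32_t *g0, uint32_t *g1)
{
    assert((pag >> 24) == PAG_HIGH_BYTE);   /* precondition, not a runtime check */
    uint32_t top = pag >> 28;
    *g0 = ((pag >> 14) & 0x3FFF) | ((top / 3) << 14);
    *g1 = (pag & 0x3FFF) | ((top % 3) << 14);
}

/* Illustrative only: rebuild the PAG from the two values. */
static uint32_t raw_to_pag(uint32_t g0, uint32_t g1)
{
    uint32_t low = ((g0 & 0x3FFF) << 14) | (g1 & 0x3FFF);  /* low 28 bits */
    uint32_t top = 3 * (g0 >> 14) + (g1 >> 14);            /* top 4 bits  */
    return (top << 28) | low;
}
```

A real decoder would also want to reject group pairs that do not decode
to something in the 0x41xxxxxx range, since ordinary groups can
otherwise look like a PAG.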

You'll note that an offset of 0x3F00 is added to each of g0 and g1 before 
they are stored in groups, and subtracted off again when they are read 
back out.  This controls where the 
block of reserved GID's will begin.  It would be trivial to make this value 
configurable, allowing the block of reserved GID's to be moved to any 
desired location.  Because such a change would affect the way the 
supplementary group lists of existing processes are interpreted, it should 
probably be changed only when afsd is started.  The appropriate way to do 
this would be with a new command-line switch to afsd, which would set a new 
parameter passed to the cache manager via a new AFSOP_* operation.  The 
handling of AFSOP_SET_FAKESTAT in src/afsd/afsd.c and src/afs/afs_call.c is 
probably a pretty good example of how to do this.
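
To make the offset idea concrete, here is a small user-space sketch of
what a settable offset might look like.  Everything named here
(pag_gid_offset, set_pag_gid_offset(), raw_to_gid(), gid_to_raw()) is
hypothetical; none of it exists in OpenAFS, and the afsd switch and
AFSOP_* plumbing are deliberately left out.  The only values taken from
the text above are the 0x3F00 default and the 0x4000..0x7FFF range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunable.  0x3F00 is the fixed offset described above. */
static uint32_t pag_gid_offset = 0x3F00;

/* Hypothetical setter, meant to be called once at startup.
 * gid_max is the largest value a gid_t can hold on this system.
 * The pre-offset values reach 0x7FFF, so the highest stored GID is
 * offset + 0x7FFF and must still fit.  Returns 0 on success. */
static int set_pag_gid_offset(uint32_t offset, uint32_t gid_max)
{
    if (gid_max < 0x7FFF || offset > gid_max - 0x7FFF)
        return -1;
    pag_gid_offset = offset;
    return 0;
}

/* Apply the offset to a pre-offset value in 0x4000..0x7FFF. */
static uint32_t raw_to_gid(uint32_t raw)
{
    assert(raw >= 0x4000 && raw <= 0x7FFF);
    return raw + pag_gid_offset;
}

/* Remove the offset; returns -1 if gid is outside the reserved block.
 * The unsigned subtraction wraps for gid < offset, and the wrapped
 * value then fails the range check. */
static int gid_to_raw(uint32_t gid, uint32_t *raw)
{
    uint32_t r = gid - pag_gid_offset;
    if (r < 0x4000 || r > 0x7FFF)
        return -1;
    *raw = r;
    return 0;
}
```

With the default offset the reserved block comes out as 0x7F00 through
0xBEFF, which matches the range Nathan quoted.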


Note that the block of reserved ID's should not be moved such that it 
overlaps with 60001-60002 or the few ID's just below 0x10000, since some 
systems use those numbers for "nobody".  It's probably not necessary to 
provide code to prevent this, though.

Moving the block entirely above 0x10000 should be possible on systems where 
gid_t is actually larger than 16 bits.  You would probably need to widen 
the short g0,g1 in afs_get_groups_from_pag(), since a 16-bit short cannot 
hold ID's above 0xFFFF.

Changing the algorithm so that the block of reserved ID's is smaller than 
0x4000 ID's should be possible, but it would not be trivial to configure. 
Since the high 8 bits of PAG values are fixed, it would probably be 
reasonable to encode only the low 24 bits in groups (12 bits in each of 2 
groups).  That would reduce the size of the reserved group space to only 
0x1000 instead of 0x4000.  Alternately, on a 32-bit-group system one could 
reserve a full 0x1000000 GID's and store the whole PAG in one group.  That 
might be interesting if the extra supplementary group slot is more 
important to you than GID space.
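
As a sketch of the 12-bits-per-group idea, something along these lines
would do.  The function names and the GID_BASE value are assumptions
chosen for illustration (0x7F00 is simply where the current block
starts with the 0x3F00 offset); this is not existing OpenAFS code.

```c
#include <assert.h>
#include <stdint.h>

#define PAG_PREFIX 0x41000000u  /* fixed high 8 bits of every PAG */
#define GID_BASE   0x7F00u      /* assumed start of the reserved block */

/* Illustrative only: store the low 24 bits of the PAG as two 12-bit
 * fields, so both groups fall in GID_BASE..GID_BASE+0xFFF. */
static void pag_to_groups12(uint32_t pag, uint32_t *g0, uint32_t *g1)
{
    assert((pag & 0xFF000000u) == PAG_PREFIX);  /* precondition */
    *g0 = GID_BASE + ((pag >> 12) & 0xFFF);
    *g1 = GID_BASE + (pag & 0xFFF);
}

/* Illustrative only: returns 0 and fills *pag if both groups lie in
 * the reserved block, -1 otherwise.  Values below GID_BASE wrap
 * around on subtraction and fail the range check. */
static int groups12_to_pag(uint32_t g0, uint32_t g1, uint32_t *pag)
{
    uint32_t hi = g0 - GID_BASE;
    uint32_t lo = g1 - GID_BASE;
    if (hi > 0xFFF || lo > 0xFFF)
        return -1;
    *pag = PAG_PREFIX | (hi << 12) | lo;
    return 0;
}
```

The one-group variant on a 32-bit-gid system would be the same thing
with a single 24-bit field and a block of 0x1000000 ID's.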

Offhand, I'd say that a change to reduce the number of reserved ID's to 
0x1000 need not be configurable, as long as the location of the reserved 
space does not change.  A change that would reduce the number further at 
the expense of more group slots would have to be configurable, as would one 
that increased the number of ID's.

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA