[OpenAFS-devel] Re: [OpenAFS] 2.6 kernel support anytime soon?
Workarounds?
Jeffrey Hutzelman
jhutz@cmu.edu
Tue, 11 May 2004 20:40:24 -0400
On Tuesday, May 11, 2004 19:07:42 -0400 Garance A Drosihn <drosih@rpi.edu>
wrote:
> At 1:19 PM -0500 5/11/04, Neulinger, Nathan wrote:
>> I think Jeff's numbers were a bit too wide... I believe that if
>> you reserve over >=32512 you should be fine.
>>
>> i.e. 7F00 - BF00 would be the range you want to avoid...
>>
>> -- Nathan
>>
>> ------------------------------------------------------------
>> Nathan Neulinger EMail: nneul@umr.edu
>> University of Missouri - Rolla Phone: (573) 341-6679
>> UMR Information Technology Fax: (573) 341-4216
>>
>>
>> > -----Original Message-----
>> > From: openafs-devel-admin@openafs.org
>>> [mailto:openafs-devel-admin@openafs.org] On Behalf Of Garrett Wollman
>>> Sent: Tuesday, May 11, 2004 1:13 PM
>> > To: Jeffrey Hutzelman
>> >
>> > This could bite us very hard. Our users are set up with primary
>> > gid == uid, and the shared UID space runs from 5000 to 32000.
>> > -----End Message-----
>
> If I am reading the include-files right, in the land of freebsd
> we have gid's that are 32-bits wide. aka __uint32_t
>
> Could we just move the range used by AFS for PAG-values?
Yes; it would be fairly easy to change the mapping to use any contiguous
block of 0x4000 group ID's. Using larger-than-16-bit ID's might be a bit
trickier, but not much. I'm sure the gatekeepers would be happy to take a
patch, and I'll even try to point folks in the right direction...
The functions you care about are in src/afs/afs_osi_pag.c; particularly,
afs_get_pag_from_groups() and afs_get_groups_from_pag(), which do the
translation in both directions.
I explained how the algorithm works in a previous message. Ultimately, we
compute two values, g0 and g1, which will be used to set the contents of
the first two slots in the supplementary groups list. Each of these values
is constrained to be in the range 0x4000..0x7FFF. This is because of the
method used to compute the high 2 bits of each value, and because the PAG
number is constrained to be in the range 0x41000000-0x41FFFFFF (see the PAG
assignment algorithm in genpag()).
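
To make the range argument concrete, here is a rough sketch in C of how a PAG might be split into the two raw values, before any offset is applied. The helper names and the exact bit layout (14 low bits per value taken straight from the PAG, with the high 2 bits derived from the PAG's top nibble in base 3) are assumptions for illustration, not a copy of the real code; see afs_osi_pag.c for that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split a PAG into two raw 16-bit values (no offset).
 * Assumed layout: low 14 bits come from the PAG; the high 2 bits come from
 * the PAG's top nibble, encoded in base 3 across the two values. */
static void pag_to_raw_groups(uint32_t pag, uint16_t *g0, uint16_t *g1)
{
    uint32_t top;

    assert((pag >> 24) == 0x41);          /* PAGs are 0x41000000-0x41FFFFFF */
    top = (pag & 0x7fffffff) >> 28;       /* always 4 for such a PAG */

    *g0 = (uint16_t)(((pag >> 14) & 0x3fff) | ((top / 3) << 14));
    *g1 = (uint16_t)((pag & 0x3fff) | ((top % 3) << 14));
}

/* Returns 1 if both raw values land in 0x4000..0x7FFF. */
static int raw_groups_in_range(uint32_t pag)
{
    uint16_t g0, g1;

    pag_to_raw_groups(pag, &g0, &g1);
    return g0 >= 0x4000 && g0 <= 0x7fff && g1 >= 0x4000 && g1 <= 0x7fff;
}
```

Under that assumed layout the top nibble is always 4, so 4/3 and 4%3 are both 1 and bit 14 is set in each value, which is where the 0x4000 lower bound comes from.
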
You'll note that an offset of 0x3F00 is added to each of g0 and g1 before
they are stored in groups, and subtracted off again when they are read
back. This controls where the block of reserved GID's will begin. It would
be trivial to make this value
configurable, allowing the block of reserved GID's to be moved to any
desired location. Because such a change would affect the way the
supplementary group lists of existing processes are interpreted, it should
probably be changed only when afsd is started. The appropriate way to do
this would be with a new command-line switch to afsd, which would set a new
parameter passed to the cache manager via a new AFSOP_* operation. The
handling of AFSOP_SET_FAKESTAT in src/afsd/afsd.c and src/afs/afs_call.c is
probably a pretty good example of how to do this.
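
As a minimal sketch of what a configurable offset could look like: the names pag_gid_base, encode_pag and decode_pag are invented for this example, the bit layout is an assumed one (14 low bits per group, top nibble split in base 3), and NOPAG is defined locally so the snippet stands alone. In a real patch the variable would be set once, at afsd startup, through the new AFSOP_* call.

```c
#include <assert.h>
#include <stdint.h>

#define NOPAG 0xffffffffu   /* defined here only so the sketch is self-contained */

/* Hypothetical tunable; 0x3F00 is the fixed offset used today. */
static uint32_t pag_gid_base = 0x3f00;

/* PAG -> two group IDs, biased by the configurable base. */
static void encode_pag(uint32_t pag, uint32_t *g0, uint32_t *g1)
{
    uint32_t top;

    assert((pag >> 24) == 0x41);
    top = (pag & 0x7fffffff) >> 28;

    *g0 = (((pag >> 14) & 0x3fff) | ((top / 3) << 14)) + pag_gid_base;
    *g1 = ((pag & 0x3fff) | ((top % 3) << 14)) + pag_gid_base;
}

/* Two group IDs -> PAG, or NOPAG if they do not encode one. */
static uint32_t decode_pag(uint32_t g0, uint32_t g1)
{
    uint32_t h, pag;

    g0 -= pag_gid_base;   /* unsigned wraparound pushes too-small IDs out of range */
    g1 -= pag_gid_base;
    if (g0 >= 0xc000 || g1 >= 0xc000)
        return NOPAG;

    h = 3 * (g0 >> 14) + (g1 >> 14);
    pag = (h << 28) | ((g0 & 0x3fff) << 14) | (g1 & 0x3fff);
    return ((pag >> 24) == 0x41) ? pag : NOPAG;
}
```

Because the groups are held in 32-bit variables here, the same code works with a base above 0x10000, provided gid_t on the platform is wide enough.
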
Note that the block of reserved ID's should not be moved such that it
overlaps with 60001-60002 or the few ID's just below 0x10000, since some
systems use those numbers for "nobody". It's probably not necessary to
provide code to prevent this, though.
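
If someone did want a sanity check anyway, it is only an interval test. The sketch below uses invented helper names, and the list of ID's is an assumption: 60001 and 60002 are the values mentioned above, and 65534 is included as one example of an ID just below 0x10000. Note that "base" here means the first reserved GID, not the offset added to g0 and g1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the reserved block [base, base + size) contains gid. */
static int block_contains(uint32_t base, uint32_t size, uint32_t gid)
{
    assert(size != 0);                    /* an empty block is a caller error */
    return gid >= base && gid - base < size;
}

/* Returns 1 if the block would cover an ID commonly used for "nobody".
 * The list is illustrative, not exhaustive. */
static int block_hits_nobody(uint32_t base, uint32_t size)
{
    static const uint32_t nobody_ids[] = { 60001, 60002, 65534 };
    size_t i;

    for (i = 0; i < sizeof(nobody_ids) / sizeof(nobody_ids[0]); i++)
        if (block_contains(base, size, nobody_ids[i]))
            return 1;
    return 0;
}
```

With the current layout the block starts at 0x7F00 and is 0x4000 ID's long, so it stays clear of all three; a block starting at 0xC000 would not.
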
Moving the block entirely above 0x10000 should be possible on systems where
gid_t is actually larger than 16 bits. Doing so would probably require
widening the short g0,g1 in afs_get_groups_from_pag().
Changing the algorithm so that the block of reserved ID's is smaller than
0x4000 ID's should be possible, but it would not be trivial to configure.
Since the high 8 bits of PAG values are fixed, it would probably be
reasonable to encode only the low 24 bits in groups (12 bits in each of 2
groups). That would reduce the size of the reserved group space to only
0x1000 instead of 0x4000. Alternately, on a 32-bit-group system one could
reserve a full 0x1000000 GID's and store the whole PAG in one group. That
might be interesting if the extra supplementary group slot is more
important to you than GID space.
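
Both alternatives are easy to sketch. The function names and the PAG_PREFIX constant below are invented for illustration, and the base is passed in explicitly only to keep the example self-contained. The decoders assume the caller has already picked out the right group slots and simply assert that the values are in range.

```c
#include <assert.h>
#include <stdint.h>

#define PAG_PREFIX 0x41000000u   /* the fixed high 8 bits of every PAG */

/* Two groups, 12 bits each: reserves 0x1000 ID's starting at base. */
static void encode_pag_12(uint32_t pag, uint32_t base,
                          uint32_t *g0, uint32_t *g1)
{
    *g0 = base + ((pag >> 12) & 0xfff);
    *g1 = base + (pag & 0xfff);
}

static uint32_t decode_pag_12(uint32_t base, uint32_t g0, uint32_t g1)
{
    assert(g0 - base < 0x1000 && g1 - base < 0x1000);
    return PAG_PREFIX | ((g0 - base) << 12) | (g1 - base);
}

/* One group, 24 bits: reserves 0x1000000 ID's; needs a 32-bit gid_t. */
static uint32_t encode_pag_single(uint32_t pag, uint32_t base)
{
    return base + (pag & 0xffffff);
}

static uint32_t decode_pag_single(uint32_t base, uint32_t g)
{
    assert(g - base < 0x1000000);
    return PAG_PREFIX | (g - base);
}
```

In both variants the 0x41 prefix is never stored in the groups; it is put back on decode, which is what lets the reserved range shrink.
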
Offhand, I'd say that a change to reduce the number of reserved ID's to
0x1000 need not be configurable, as long as the location of the reserved
space does not change. A change that would reduce the number further at
the expense of more group slots would have to be configurable, as would one
that increased the number of ID's.
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA