[AFS3-std] rxgk implementation notes

Thu, 28 Feb 2013 17:05:07 -0500 (EST)

I've got a partial implementation of rxgk, just enough to perform the 
negotiation exchange to get a token and then use that token to "secure" 
an RXGK_LEVEL_CLEAR connection which is used to perform another 
token negotiation exchange.  (This implementation is using the token 
format &c. from the rxgk-afs document.)

As a result of this, I have a few comments on the rxgk document, based on 
that implementation experience:

We still have some places that refer to or imply an ordering of security 
levels, such as "this MUST NOT be less than the security level originally 
negotiated"; the OpenAFS rxkad implementation has this sort of assumption 
baked in, in that the client and server track a "minimum security level" 
in the security object private data, and make numeric comparisons against 
that value.  We used to have a channel-bound security level, and 
conceivably one may get added in the future.  It would be nice to either 
settle on "a token only supports exactly the level it was negotiated for" 
or some better ordering on the values of security levels.  Having the 
client and server explicitly track all allowable security levels would be 
somewhat annoying.
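
To make the three options concrete, here is a minimal sketch in C.  The 
enum names and numeric values are illustrative assumptions rather than 
the document's definitions, and the function names are hypothetical. 
The first check is the rxkad-style numeric comparison, the second is 
"exactly the negotiated level", and the third tracks an explicit set of 
allowed levels as a bitmask.

```c
#include <stdint.h>

/* Illustrative level values; the names and numbers are assumptions. */
enum ex_level { EX_LEVEL_CLEAR = 0, EX_LEVEL_AUTH = 1, EX_LEVEL_CRYPT = 2 };

/* rxkad-style check: only meaningful if larger numbers mean "stronger". */
static int ex_level_ok_minimum(int min_level, int level)
{
    return level >= min_level;
}

/* "A token only supports exactly the level it was negotiated for." */
static int ex_level_ok_exact(int negotiated, int level)
{
    return level == negotiated;
}

/* Explicit set of allowed levels, one bit per level; no ordering assumed. */
static int ex_level_ok_set(uint32_t allowed_mask, int level)
{
    if (level < 0 || level >= 32)
        return 0;
    return (int)((allowed_mask >> level) & 1u);
}
```

The bitmask form does not care where a new level (say, a channel-bound 
one) lands numerically, at the cost of both peers having to carry the 
mask around.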

Having a mixture of RXGK_Data and plain XDR opaque types gets a bit 
annoying; in some places I end up having one variable of each type that 
alias the same storage.

It's hard to get the logic right for when to terminate a GSS negotiation 
loop, especially when the same C variable is used to hold major/minor 
status information from both the client and server calls.  I still owe us 
an update to the text in the document on this matter.
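
As a rough illustration of keeping the two sides' status separate, here 
is a sketch of the loop shape in C.  Everything in it is hypothetical: 
the ex_ names, the status enum, and the mock peer stand in for 
gss_init_sec_context() on the client and the RPC that runs 
gss_accept_sec_context() on the server.  Tokens are reduced to a length 
so that the control flow is all that remains; this is not the text the 
document should carry.

```c
#include <stddef.h>

/* Hypothetical status values; real code would look at GSS_S_COMPLETE,
 * GSS_S_CONTINUE_NEEDED and GSS_ERROR() from gssapi.h instead. */
enum ex_status { EX_COMPLETE, EX_CONTINUE, EX_FAILURE };

/* One negotiation step: takes the peer's token (length only here),
 * reports the length of the token to send back, and returns a status. */
typedef enum ex_status (*ex_step_fn)(void *state, size_t in_len,
                                     size_t *out_len);

/* Scripted mock peer: completes after calls_left calls, emitting a token
 * on every call except possibly the last one. */
struct ex_peer { int calls_left; int final_token; };

static enum ex_status ex_mock_step(void *state, size_t in_len,
                                   size_t *out_len)
{
    struct ex_peer *p = state;
    (void)in_len;
    if (p->calls_left <= 0)
        return EX_FAILURE;          /* called again after completion */
    p->calls_left--;
    if (p->calls_left > 0) {
        *out_len = 1;
        return EX_CONTINUE;
    }
    *out_len = p->final_token ? 1 : 0;
    return EX_COMPLETE;
}

/* Returns 0 only if both sides independently report completion. */
static int ex_negotiate(ex_step_fn client_step, void *cstate,
                        ex_step_fn server_step, void *sstate)
{
    enum ex_status client_major = EX_CONTINUE;  /* client-side status */
    enum ex_status server_major = EX_CONTINUE;  /* server-side status */
    size_t to_server = 0, to_client = 0;

    for (;;) {
        client_major = client_step(cstate, to_client, &to_server);
        if (client_major == EX_FAILURE)
            return -1;
        if (to_server == 0)
            break;              /* nothing to send; server is not called */
        server_major = server_step(sstate, to_server, &to_client);
        if (server_major == EX_FAILURE)
            return -1;
        if (client_major == EX_COMPLETE)
            break;              /* client will not process further tokens */
    }
    return (client_major == EX_COMPLETE &&
            server_major == EX_COMPLETE) ? 0 : -1;
}
```

With a single shared status variable, the final test collapses to 
whichever side was called last, which is exactly the case that is easy 
to get wrong.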

Having publicly-visible routines that take a gssapi context or other 
gssapi type is pretty annoying for the code structure, as it makes 
gssapi.h a prerequisite for the rxgk header.  This is rather inevitable, 
but it may prove convenient to split that stuff off into a separate 
header/library only used by the negotiation service.

Additionally, the gssapi libraries shipped with some OSes do not support 
the pseudo-random functions that we need.  This is just something that 
implementations will have to deal with.

It takes a fair amount of code to prepare the several structures that we 
specify as opaque or RXGK_Data typed objects that are the XDR encoding of 
some more detailed type.  Manually specifying bit layouts would be 
differently annoying though, I'm sure.

Relatedly, there's not an obvious error code to return when such encoding 
fails or we can't allocate memory for the new object.  I've been using, 
e.g., RXGEN_SS_MARSHAL for now, but I wonder if there are more RXGK error 
codes waiting to be added here.  There's also some temptation to make 
things like BADLEVEL apply to the authentication step as well as just 
token negotiation; that would probably warrant changing the associated 
error string.

We say "rxgk challenges simply contain some versioning information and a 
random nonce selected by the server", though the current RPC-L is just a 
nonce.  It sort of seems like it may be worth adding a version field in 
case the challenge/response protocol ever needs to be rev'd.

The transport key is derived from the master key and a few pieces of 
information that are sent from client to server in the clear; it therefore 
seems that the utility of the transport key for avoiding key sharing 
mostly comes into play when large quantities of data are being conveyed. 
It is still useful as a mechanism for rekeying after lifetime or bytelife, 
though.  (I think, at least -- I haven't implemented rekeying yet.)

Endianness conversion routines for 64-bit quantities (i.e., rxgkTime) do 
not seem to be universally available; an incorrect hand-rolled 
implementation could lead to interoperability issues.
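
One way to sidestep the missing routines is to skip byte swapping 
altogether and build the big-endian wire form one byte at a time, which 
behaves the same on any host.  A minimal sketch in C follows; the 
function names are hypothetical, and the value is treated as an opaque 
unsigned 64-bit quantity (a signed time value would be cast to uint64_t 
before encoding and back after decoding).

```c
#include <stdint.h>

/* Write v as 8 bytes in network (big-endian) order. */
static void ex_put_be64(unsigned char out[8], uint64_t v)
{
    int i;
    for (i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);  /* low byte goes last */
        v >>= 8;
    }
}

/* Read 8 bytes in network (big-endian) order into a host value. */
static uint64_t ex_get_be64(const unsigned char in[8])
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v = (v << 8) | in[i];                /* high byte comes first */
    return v;
}
```

Because the shifts work on values rather than on memory layout, there is 
no host-endianness test to get wrong.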

It's a bit awkward to include call number information in the authenticator 
(read: feels like a layering violation), and it's not clear to me what 
benefit is gained from doing so.  I vaguely remember seeing some 
discussion of this in an archive somewhere, but can't find it now.  All I 
see looking now is discussion of using the maxcalls information as a way to 
migrate to having more than 4 calls per connection, which doesn't sound 
familiar.
We also don't mention whether call numbers in use are checked as part of 
verifying the authenticator -- I do not currently do so.

Strictly speaking, this affects the rxgk-afs document, but including the 
client UUID in the authenticator's appdata field is pretty awkward.  This 
field "contains the UUID of the client", which sounds like it should be 
the cache manager's idea of the UUID, but getting that from the rx layer 
seems to be a layering violation, and we may have situations (e.g., bos) 
where an rxgk client is not running a cache manager.  At the moment I just 
generate a fresh one and don't store it anywhere, which seems 
counterproductive.

-Ben