[AFS3-std] Last Call: afs3-rxgk-04, afs3-rxgk-afs-02

Benjamin Kaduk kaduk@MIT.EDU
Fri, 26 Apr 2013 13:03:35 -0400 (EDT)


On Thu, 25 Apr 2013, Simon Wilkinson wrote:

> Here are some comments on afs3-rxgk-04. Many of these are editing notes, 
> but I think this document needs further work with regards to the 
> description of the GSSAPI handshake, and a fuller discussion of limits 
> to the sizes of the various opaque data types is required.

Thanks for the comments; replies inline where appropriate.
I don't think there was any discussion at all about the size of the opaque 
types; I just came up with some numbers and put them in.
There are a couple of omissions in the description of the GSS handshake 
but I do not think any fundamental flaws.  A rewrite could potentially be 
easier to follow, but may not be necessary.

> Abstract
> --------
>> This document provides a general description of
>> rxgk. A further document will provide details of integration with
>> specific RX applications.
>
> I think we're now at the point where this document provides information 
> on how to integrate rxgk with generic RX applications - the only 
> information in rxgk-afs is how to integrate it with the AFS-3 protocol 
> family - so I think this could be clearer. Perhaps:

Well, this document does not define a token format, so there remains work 
for any application wishing to use rxgk.  Whether or not that falls under 
the mantle of "integration" is debatable, I suppose.

> "This document provides a general description of rxgk and how to 
> integrate it into generic RX applications. Application specific 
> behaviour will be described, as necessary, in future documents"

I'm happy to take this change, regardless.

> 6. Key Negotiation
> ------------------
>
>>        const RXGK_MAXENCTYPES = 10;
>
> Limiting to a maximum of 10 encryption types seems somewhat low, given that this is a hard limit that can't be negotiated around. Each encryption type only takes 4 bytes on the wire, so there doesn't seem to be a resource-based limit here. Given the sprawling proliferation of Kerberos encryption types, I can see somebody, at some point, wanting to use more than 10 here.
>
>>        const RXGK_MAXLEVELS = 10;
>
> Whilst we only support 3 levels today, again a hard limit of 10 seems low.
>
> How about a limit of 255 for each of these?

I don't mind bumping these to 255; part of the justification for a 
small-ish limit of 10 is that if the client really does have more options, 
it can call the negotiation rpc multiple times, iterating through chunks 
of its list.  We have error values for bad enctype and bad level which 
would prevent that from becoming an N^2 problem.

>>       const RXGK_MAXMIC = 16384;
>
> By contrast, this seems large. The MIC is going to be the length of the 
> chosen encryption type's checksum, plus a small amount of header 
> material. For aes-256, that's a mere 28 octets (16 of CFX token, 12 of 
> checksum). So, I think we could get away with something smaller here.

I note that you do not suggest a new value :)
This is large, it is true.  On the other hand, the goal of putting a limit 
here is to avoid excessive resource allocation by the rx peer without 
getting in the way of normal operation ... on the scale of 2**31, it's not 
very large at all.  Would you want to drop down to, say, 1024, or even 
smaller?

>>       typedef opaque RXGK_Data<>;
> [ ... ]
>>           RXGK_Data token;
>
> Given we're limiting everything else, should we also put some kind of 
> upper limit on the size of the token? We do need to consider, though, 
> that a token may contain non-Kerberos identities, and X509 (in 
> particular) identities can be very large.

There should probably be a limit of some sort.  It's unclear whether this 
limit should be different from a hypothetical global default limit on the 
size of an opaque type (per the thread Andrew started).  (My picture of 
this is that openafs's rxgen would translate opaque<> into, say, 
opaque<16k>, but that opaque<1m> given explicitly would still be allowed. 
Numbers not guaranteed to bear resemblence to reality.)

If we were to explicitly limit the size of a token (or a generic 
RXGK_Data, I suppose), what would such a limit be?  If we start talking 
about X.509 identities, maybe 16k is too small?

>>       GSSNegotiate(IN RXGK_StartParams *client_start,
>>            IN RXGK_Data *input_token_buffer,
>>            IN RXGK_Data *opaque_in,
>>            OUT RXGK_Data *output_token_buffer,
>>            OUT RXGK_Data *opaque_out,
>>            OUT unsigned int *gss_major_status,
>>            OUT unsigned int *gss_minor_status,
>>            OUT RXGK_Data *rxgk_info) = 1;
>
> Same question here - RXGK_Data has no limits.

But what would the limit be?
I suppose I can mention my commit message for when I went and put in 
length restrictions:
%%%%%%%%%%%%%%
     Supply bounds for variable-length arrays

     Define constants and use them.

     There is no limit for the RXGK_Data typedef at this time, though that
     is perhaps the most important case to consider.  Such a limit should
     probably be merged in with a prospective limit on the generic OpenAFS
     rx_opaque type.

     The authenticator limit implicitly limits the appdata and call_numbers
     fields, so separate limits are not needed there.
%%%%%%%%%%%%%%

It is tempting to say that we need not talk about a global limit for this 
document and can move such discussion to the thread Andrew started 
("Unlimited array lengths").  I suppose I can look if there is literature 
on the size of GSS negotiation tokens for guidance.

>> This nonce SHOULD be at least 20 octets in length.
>
> Is SHOULD strong enough here? Using a nonce that is too short will 
> reduce the security of the resulting key.

Will it?  The client_nonce is sent in the clear ... my take is that its 
primary purpose is as a uniquifier for the startparams structure (which is 
MIC'd for the server reply to prevent MITM).  Anyway, the GSS mech is 
supposed to be able to produce non-guessable output from 
gss_pseudo_random() even if no seed is provided, I thought.

>> The GSS service name is application dependent.
>
> We've described how to construct the acceptor name earlier in this 
> section - perhaps refer back to that?

Sure, good catch.

>> The client then calls RXGK_GSSNegotiate, as defined above.
>
> The client should only call RXGK_GSSNegotiate if the call to 
> GSS_Init_sec_context succeeded

This text is supposed to be an introductory preface with a high-level 
overview of what goes on, which is described in great detail later on. 
I'm not fully convinced that this particular text needs changing.

>>  client_start  The client params structure detailed above.  This
>>         should remain constant across the negotiation
>
> "should" or SHOULD, or even MUST?

If we went with MUST then I would want the server to enforce its 
invariance, which would mean more stuff in the opaque.  Even for SHOULD, 
it's not clear how much value would be added -- the server can't construct 
a token until gss_accept_sec_context() returns completion, and since the 
UDP server is stateless, it might as well do all the input validation then 
as well.  (Of course, it could do validation earlier and save state in the 
opaque.)  The other question to consider is what behavior the client 
should expect if it does change start parameters between calls in the same 
negotiation loop.  I could see it plausible that the client should be 
prepared to accept any combination of the options it ever passed, but most 
likely the client expects to get what it sent last (consistent with the 
server having to do validation when it generates the token, which is at 
the last call).  After all, the client has to validate the MIC of the 
startparams structure which the server sends back in the reply.  In this 
sense, only the last version sent "really matters".  The server could do 
some sanity checking every time (before calling into GSS, even), but I see 
that as mostly an optimization.  The onus is on the client to keep things 
consistent, but I'm not convinced there's a need for RFC2119 language 
here.

(This discussion remains entirely theoretical for me, though, as my 
implementation does not handle mechs which require more than one call to 
gss_accept_sec_context.)

>>   Each call to GSSNegotiate will return an output token from
>>   GSS_Accept_sec_context() and/or an output opaque
>
> I don't think that "and/or" is appropriate here. Output opaques are 
> separate from the negotiation itself - they just serve as a means for 
> the server to preserve state across multiple RPCs.

That is the function of output opaques, yes.  It's not clear why this 
affects the appropriateness of "and/or", though.  I don't see any problem 
with a server implementation that always returns an opaque with state 
information, even if no further call is needed.
Other cases which should definitely be allowed are when the client has 
already succeeded but has a GSS token for the server; in this case the 
server succeeds and needs send neither GSS token nor output opaque.  (Per 
the above, it could send an opaque if it wanted to, though.)
If the server's state is GSS_S_COMPLETE but it has an output GSS token for 
the client, there is no need to send an output opaque, but the GSS token 
must be returned.

I see all four combinations of (token/no token, opaque/no opaque) as 
possible; thus, "and/or" seems appropriate.

>>         If the major status code from GSS_Init_sec_context() includes a
>>         fatal error code, the negotiation loop is in an error condition
>>         and terminates.
>
> "includes a fatal error code" doesn't seem clearly defined. How about 
> just "indicates a GSSAPI error"

Okay.  This language was inspired by the apparent possibility for a major 
status being returned for which GSS_ERROR(major) is true but auxiliary 
informational bits are also set.  Your version is better.

>>         If the major status code is GSS_S_COMPLETE and
>>         the mutual_state flag is not true, or the major status code is
>>         GSS_S_COMPLETE and the conf_avail flag is not true, or the
>>         major status code is GSS_S_COMPLETE and the integ_avail flag is
>>         not true, the negotiation loop is in an error condition and
>>         terminates.
>
> This could be expressed much more succinctly as:
> "If the major status code is GSS_S_COMPLETE and the mutual_state, 
> conf_avail flag and integ_avail flags are not all true, the negotiation 
> loop is in an error condition and terminates"

Sure.  I think I did not have all the flags at once, so the language was 
added for each in sequence.

>>         If the major status code is GSS_S_COMPLETE and the
>>         output token is zero length, this is a success condition and
>>         the negotiation loop terminates (this cannot happen on the
>>         first iteration of the loop).  Otherwise, if the major status
>>         code does not include GSS_S_CONTINUE_NEEDED, the negotiation
>>         loop is in an error condition and terminates.  If the major
>>         status code includes GSS_S_CONTINUE_NEEDED, the output token is
>>         sent to the server, per the next step.
>
> This paragraph suggests that returning GSS_S_COMPLETE from 
> GSS_Init_sec_context() and a non-empty output token is an error. All of 
> the other GSS negotiation loops that I'm aware of permit this behaviour 
> (the output token is simply sent to the server). This behaviour is also 
> specifically permitted by RFC 4462.

Oops.  GSS_S_COMPLETE with non-empty token is allowed, but is neither a 
success nor failure condition for the loop (as the token must be sent to 
the server first).  Clearly the thoughts in my head did not match the 
words on the paper when I was writing this.

I guess we need to add "If the major status code is GSS_S_COMPLETE and the 
output token is of nonzero length, the negotiation loop proceeds and the 
token is sent to the server." before "Otherwise".

>>        If the most recent call to
>>        GSS_Init_sec_context() returned a major status code including
>>        GSS_S_CONTINUE_NEEDED and the output_token_buffer is zero
>>        length, the negotiation loop is in error and terminates.
>
> This doesn't need to be explicitly stated - if it is an error, then 
> Gss_Init_sec_context() will handle it when we feed it the zero length 
> token.

I guess so.  I don't think it's actively harmful to say so, but I can 
remove the sentence if you want.

> Also:
>
> When the server receives GSS_S_COMPLETE from Gss_Accept_sec_context, it 
> also needs to check the context flags and make sure that at least 
> conf_avail and integ_avail are true. It needs to do that before it 
> constructs the ClientInfo structure.

Thanks, good catch -- I only mentioned that the client had to check.

> I found the description of the loop hard going, and I know what it 
> should do! I'm not sure how best to improve this, but as I've said 
> before, I believe the language, and layout, of the description in the 
> SSH GSSAPI RFC to be a good place to start.

I did try to use that as guidance while rewriting, though a directly 
parallel structure would require completely changing the nature of the 
exposition.  I will go reread their text and think about whether I want to 
go that route...

>
> 7.3 RPC Definition
> ------------------
>
>>       struct RXGK_TokenInfo {
>>           int errorcode;
>
> I'm not convinced that we need a separate error code here. Clients 
> shouldn't be making negotiation decisions based upon the results of 
> CombineTokens - a failure of this function should just be reported up to 
> the user.

This came in back in November when we realized that (at that time) 
CombineTokens as written was useless, and we pulled in all of the 
interesting bits from AFSCombineTokens.  At the moment, the RXGK_TokenInfo 
data structure is shared between CombineTokens and AFSCombineTokens; the 
latter surely does need an in-band errorcode field.  I would kind of like 
to avoid having an abundance of similar-but-different data structures and 
keep this sharing, even if the errorcode might not be strictly necessary 
here.  (I haven't implemented a CombineTokens yet, so I don't have insight 
to add from experience.)

> 8.5 The Response
> ----------------
>
>>   const RXGK_MAXAUTHENTICATOR = 1500  /* better fit in a packet! */
>
> The normal maximum size of an RX payload is 1416 bytes, so an 
> authenticator of 1500 definitely won't fit in a packet! rxkad ends up 
> taking liberties here, and just extends the maximum size of packet when 
> the data it needs to transmit is too large, but that causes issues on 
> networks that drop fragmented UDP. It would be good to avoid those 
> here...

So you want to change 1500 to 1416?  I don't mind -- as I said, these 
constants were just things I made up and they didn't get much discussion.

> 12.3 Nonce Lengths
> ------------------
>
>>  The client_nonce is important for using the MIC of the StartParams
>>   structure as a defense against man-in-the-middle attacks,
>
> I'm not sure what this sentence is trying to say?

The hypthetical scenario here is where a client gets two tokens using the 
same kerberos service ticket.  In my thought experiment, an active 
attacker could change the startparams for the second request to the 
server, and replace the 'mic' field in RXGK_ClientInfo with the value from 
the first request.  Oops, but RXGK_ClientInfo is encrypted for 
transmission, so things would break pretty fast.

Anyway, since client_nonce is sent in the clear, it is not adding secure 
randomness.  It does make each request unique.

-Ben