[AFS3-std] last call comments on draft-wilkinson-afs3-rxgk-afs-07

Thu, 14 May 2015 17:23:20 -0400 (EDT)

On Mon, 4 May 2015, Michael Meffie wrote:

>
> This note announces the start of a one-week last call within the AFS-3
> Standardization Group on the following document:
>
> Title:           Integrating rxgk with AFS
> Filename:        draft-wilkinson-afs3-rxgk-afs-07
> URL:             https://www.ietf.org/archive/id/draft-wilkinson-afs3-rxgk-afs-07.txt

I've read through this document again, and overall, it seems in pretty
good shape.  There are a few changes I want to propose to the protocol
itself, though, as well as a number of minor tweaks to the text for
readability and some comments on potential discussion points.

protocol changes:

The specification for AFSCombineTokens has "IN afsUUID destination"; it
seems that this should be a pointer type instead, i.e., "IN afsUUID
*destination".

In section 9, when no cm_tok is supplied, the K0 from the user_tok is
currently reused unchanged in the new_token.  There doesn't seem to be any
particular reason for directly reusing this key material, so I would
propose that we use some sort of key-derivationi scheme with a fixed salt
and the fileserver uuid as a pepper, just from the sense of crypto best
practices against reusing key material for different applications.

Likewise, in the computation of the per-fileserver key after
VL_RegisterAddrsAndKey, it may be worth including a fixed pepper string
into the PRF+ operation to provide an extra layer of protection against
nonce reuse.

I do not believe that the document discusses the use/meaning of the kvno
field in the RXGK_ServerKeyDataResponse.  It is pretty clear from context,
that this is the vlserver telling the fileserver what version number is
associated with the generated key, but we need to introduce text saying
that.

Section 11.1's third paragraph says that cache managers should cache
callback key information in per-connection private data, implying that
CM-->fs connections and fs-->CM connections are the same objects.  They
are not, so this text is actively harmful.  I believe it can safely be
removed without replacement.

Some additional comments:

Section 5 talks about the "presence of" GSS identities, which is grounded
in a krb5 mindset -- not all GSS mechanisms can offer the initiator clear
(and authenticated!) evidence that a given name does not exist.  We only
use MAY language here, though, so I do not propose any changes at this
time.

The TokenContainer structure introduced in section 6.1 uses a signed type
for the key version number.  Kerberos uses an unsigned type for the key
version number, and the core rxgk document talks about a 16-bit unsigned
integer key version number in the rx header which MAY be stored locally
"as a 32-bit integer on both endpoints".  It's a bit late, but we might
want to make this type unsigned as well.

We should probably review all of the opaque types to see where length
limits are appropriate -- types declared as RXGK_Data inherit a limit from
section 4, but that type is not universally used throughout.  All the
appearances seem okay to me, since the token-related things are all going
to end up in RXGK_Data objects specified in the core rxgk spec, and the
RegisterAddrsAndKey structures also end up in bounds-limited arrays.

In section 6.3, the identities<> array of PrAuthNames is populated with
more than a single identity only by the generic CombineTokens procedure.
We're still sort of punting on the actual meaning of a list of length
greater than one, which I do not object to.

In sections 7.1 and 10, we talk about keyed clients and file servers
obtaining (database) tokens using GSS credentials they control.  For their
respective purposes, I think that these identities do not necessarily need
to have a pts id associated with those credentials, but wanted to mention
the issue on the list in case I'm wrong and we need to say something about
allocating pts ids for them.

This may just be a wordsmithing issue, but in section 9, we cover the
handling of the identities lists in AFSCombineTokens, wherein the list
from the cm_tok is discarded unused.  This seems to be, in effect, relying
on the client_uuid in the RXGK_Authenticator appdata to bind to the cache
manager machine, along with proof-of-possession semantics.  This is
probably okay, but may merit some additional eyes.

Section 10 does not explicitly mention intra-ubik traffic as
server-to-server communication (which is fine, since it's out of scope for
AFS-3 protocol discussions), but it is still server-to-server
communication.  I believe that all of the text present in that section
still is appropriate for intra-ubik connections, but wanted to mention
this as another place for extra eyes.

I slightly concerned about section 10.2.2, since I do not have any
implementation experience for it (and I do with almost all the rest of the
document), but I do not have any concrete concerns, and the proposal seems
reasonable as I read it.

In section 10.3, we explicitly say "vlservers MUST NOT permit calls to
VL_RegisterAddrsAndKey for fileserver UUIDs which already exist within the
vldb, unless that UUID already has a server-specific key registered."
First off, I'm not convinced that this is a strict interoperability
requirement, so a SHOULD NOT may be fine.  Also, we should probably say
why this is not desired (it's much easier to do a transition by spinning
up new machines for rxgk and moving volumes piecemeal than trying to
upgrade in-place which has lots of messy edge cases that might not be
implemented right).

Wordsmithing:

In section 2, s/AFS protocol/AFS-3 protocol/ seems appropriate.

The text in section 5.1 may be clearer if it explicitly says "as needed by
the implementation" instead of just "as needed".

Section 6.2 talks of encrypting tokens "both with a single cell-wide key
and with per-file-server keys", which reads as if both keys could be used
for (doubly?) encrypting the same token.  This should probably get
rewritten to use either/or language.  In the next sentence, "desired" is
not the right word -- per-fileserver keys may be desired but not in use
for some reason out of the control of the cell administrator.

In section 7.1, the language about "unnecessarily close expiration times"
is a bit awkward and could probably be improved, perhaps to "expiration
times that are unnecessarily close".  A couple sentences later, "domain
name of the cache manager" is probably a type error, since it is the
machine upon which the cache manager is running that has a domain name.

In section 9, we note that "An rxgk capable client operating within an
rxgk enabled cell MUST NOT downgrade its choice of security layer in any
other situation"; it may be worth mentioning explicitly that it will not
try rxgk at all for contacting a cell which is not rxgk enabled.

Section 10.2 has a "is is" which should be "it is".

Section 10.3 has "a manual, or out of band, key management system", and I
think that the "or" should be removed for clarity.

-Ben