[AFS3-std] draft-wilkinson-afs3-rxgk-afs-01 comments

Simon Wilkinson simon@sxw.org.uk
Sun, 10 Jun 2012 13:24:59 +0100


On 4 Jun 2012, at 01:44, Andrew Deason wrote:

I've made the typographical changes. I don't see any point in =
enumerating them again here.

> Less trivial:
>=20
> As a general comment, there is no section for the AFS registrar. Since =
we're
> creating new RPCs, not only does this section need to exist, it also =
needs to
> request that RPC code points be allocated.

I'll add a section for the registrar.

> -- "securityIndex" is not defined; there should be a reference to the =
Rx
>    specification.

We have lots of RX terminology in this document that we can't usefully =
define. There's nothing particularly special about securityIndex. I =
suppose that this sentence could be amended to something like "rxgk has =
an RX securityIndex value of 4", to make it clear that we're talking =
about an RX parameter.

> -- We don't specify the Rx service ID for the rxgk key negotation =
service,
>    and neither does the rxgk document.

Good catch. 34567 is the value that's been in use since Love's original =
implementation, but I should check whether this is allocated by the =
registrar. This needs to be specified in the rxgk document, though, =
rather than here.

> -- This section states that GSSNegotiate should only be utilized with =
database
> servers, however it does not specify what should happen if not all =
database
> servers support rxgk and/or do not have the common keying material.

All database servers are expected to have identical configuration - rxgk =
is no exception to that. Given that this is a requirement of Ubik =
itself, I don't think we need specific language in this draft to address =
this.

> 3.1
>=20
> -- Penultimate paragraph: "This is the only situation in which an rxgk =
capable
> client operating within an rxgk enabled cell may downgrade its choice =
of
> security layer."
>=20
> This seems excessively strong, since this seems to imply that a client =
MUST
> NOT downgrade in any other situation.

I'm happy with this being strong. Downgrading in other situations is a =
clear security risk. We should perhaps also note this in the security =
section of the document, but I think making it a requirement that =
compatible clients not perform arbitrary downgrades is fine.=20

I would propose changing the language here to reflect this, and replace =
this sentence with the one you suggest -

"An rxgk capable client operating within an rxgk enabled cell MUST NOT =
downgrade its choice of security layer in any other situation"

> 4.1:
>=20
> -- This seems really vague about where this structure actually goes. I =
believe
> it is XDR-encoded, and then put in the opaque field for tokens from =
the rxgk
> specification. But currently this is not specified.

Perhaps something like "The RXGK_TokenContainer structure is XDR encoded =
and
transported within the 'token' field of the RXGK_ClientInfo structure =
specified in
[I-D.wilkinson-afs3-rxgk]"

> 4.2:
>=20
> -- 2nd paragraph, last sentence: This seems like maybe something that =
the
> administrator should be doing, rather than the protocol implementor. =
Is RFC
> 2119 language appropriate in this situation?

I'm not sure - it's both implementor and administrator dependent. =
Implementations
need to support using and storing multiple keys, so both old and new =
keys can be
used at the same time. They also have to support incrementing the key =
version
number whenever the key is changed. It's only in certain situations that =
this
burden also falls on the administrator.

> 4.3:
>=20
> -- Again, this is a little unclear on where this structure actually =
goes. It
> is XDR-encoded, encrypted, and then the result goes in the =
TokenContainer
> encrypted_token field.

The final paragraph of section 4.2 discussed this. I suspect to make =
things
clearer, an additional sentence should be added at the end of this =
paragraph,
saying "The encrypted data is stored in the encrypted_token field of the=20=

TokenContainer structure described in section 4.1"

> -- The structure name is also confusing, since the rxgk specification =
defines
> a type called RXGK_Token, which is just an opaque blob.

That's actually an editing mistake in the rxgk specification - that =
field should
actually be called RXGK_Data. Given that that document needs respun =
anyway, I
would propose fixing this there.

> -- The definition of 'level' is vague; it should say it refers to a =
specific
> rxgk integrity level, and reference the section where they are defined

Cross references between documents are very fragile at this stage. I =
would be
tempted to say something along the lines of

"level: The rxgk security level, as defined in [I-D.wilkinson-afs3-rxgk]
 that MUST be used for this connection"

It should also be an RXGK_Level, and not an int.

> -- startTime should probably be an rxgkTime. The definition also seems =
a
> little oddly worded; we should say that servers MUST reject attempts =
to start
> connections with that token after startTime

Yes, it should be an rxgkTime. I don't think your proposed definition is =
correct - a token can only be used after its startTime. How about

"Servers MUST reject attempts to use tokens with a startTime value later =
than the current time"

> -- For lifetime and bytelife, I find "This is an advisory limit" to be =
vague;
> if they can simply be ignored, we should say that adhering to and =
enforcing
> these limits is OPTIONAL.

The exact behaviour, with relation to this, is specified in the rekeying =
section
of the RXGK document. How about just replacing the "advisory" language =
with:

   lifetime:  The maximum number of seconds that a key derived from K0
         may be used for, before the connection is rekeyed.  If 0, keys =
have
         no time based limit.

   bytelife:  The maximum amount of data (expressed as log 2 byes) that
         may be transferred using a key derived from K0, before the =
connection
         is rekeyed.  If 0, there is no data based limit on key usage.

> 5:
>=20
> -- I would find this clearer if we had an RPC-L definition with a =
structure,
> even if that structure just contains a single element, which is an =
afsUUID.

I think we can do this without wire level changes, so how about

   The appdata opaque within the RXGK_Authenticator contains the results =
of
   XDR encoding the RXGK_Authenticator_Contents structure. The uuid =
field
   contains the UUID of the client.

        struct RXGK_Authenticator_Contents {
            afsUUID uuid;
        }

> -- This section implies that an attack exists if we just use the =
users' tokens
> to fetch data, but it doesn't provide a lot of details. I would guess =
that the
> relevant attack is if a user poses as a fileserver, and can supply bad =
data
> to the victim client, while still encrypting the data. If so, this =
should be
> made more clear, but it also seems like this is always a problem for =
existing
> rxkad, and for all rxnull connections, which should be stated if true.

The attack is that an authenticated user requests data from the =
fileserver through
the cache manager. As the user knows the session key, they can =
impersonate the
fileservers responses to the cache manager, and so insert bad data into =
the cache.
The aim of combining the cache managers key with the users is that we =
get a session
key that is unknown to the user, and so they can't fiddle with traffic =
protected using
it. As you note, this is a problem with both rxkad and rxnull as well. =
However, this document isn't supposed to serve as a guide to RX security =
issues, or to clarify the situation with existing mechanisms. My feeling =
was that the text as it stands provides sufficient information to =
justify the feature, without going into too much detail.

> It also seems a little odd that we appear to be using the CM-specific =
keying
> data to verify data integrity, and not something from the fileserver =
or
> cell-wide preshared key. If that's unavoidable, that's okay, but it =
seems a
> bit odd, unless I'm misunderstanding something.

If you consider the security model, I don't think it's odd at all. On a =
multi-user machine the CM acts as a proxy, and protector, for a group of =
users. When data is fetched from a fileserver it isn't just requested on =
behalf of a single user, but instead for that user, and the cache =
manager itself. So, using both keys to secure this requests seems to =
make sense.

Given that this is all built on symmetric encryption, using the =
preshared-key isn't viable, as the client doesn't (and cannot) have any =
knowledge of this key.=20

We have three security relationships that are required:
1/ The user knows that it is talking to a genuine fileserver
2/ The fileserver knows that it is talking to a particular user identity
3/ The cache manager knows that it is talking to a genuine fileserver

1 and 2 are established, through the user's rxgk token. 3 is established =
through the cache manager's rxgk token. Combining both tokens together =
gives us a single token which establishes all 3 desired security =
relationships.

> 6.1:
>=20
> -- 1st paragraph, last sentence: should this read "The client SHOULD
> frequently renew"? This seems like it would be helpful to say that the =
client
> SHOULD NOT renew more frequently than, say, lifetime/4.

Yes it should be capitalised. The reason for this requirement is that =
the lifetime of a combined token is the smallest lifetime of the tokens =
combined into it. So, a client which doesn't frequently renew its tokens =
will end up creating very short-lived combined tokens.

> 7.1:
>=20
> -- "Printing" tokens appears to be a term invented by the document to =
discuss
> such tokens. I don't find this very clear, since the document just =
starts
> using the term; this should be mentioned, and the definition =
explicitly
> stated. This could probably just go at the end of section 7's =
preamble, since
> it already seems to be describing what printed tokens are.

How about:=20
        We refer to the process of forging tokens for local use,
	given access to the cell-wide pre-shared key as "token printing"

> -- We should have a reference to the VVL spec for VL_RegisterAddrs =
mentions

Yeah, we probably should. What is the best way of referring to this =
document?

> 12.2:
>=20
> -- I agree this is a problem with this approach, and makes me never =
want to
> use it. However, the section doesn't really give an indication of what =
to do
> about it. Should the automatic registration never be used? Or should =
it just
> be used by sites which have less concern for security, and the greater
> convenience of the automatic registration outweighs the downside? If =
so, it
> should be mentioned.

This is actually pretty common. For example, every time you use kadmin =
to extract a new keytab for a service, you're using your short lived =
session key to protect the long term service key. I don't see it as a =
particular problem for most environments, but thought it was worth =
pointing out, as it might be an issue for some.

Thanks for the exhaustive review!

Cheers,

Simon.