[OpenAFS-devel] Progress on Linux in-kernel RxRPC library

Tue, 22 Mar 2005 20:54:00 -0500

On Mar 22, 2005, at 14:49, Jeffrey Hutzelman wrote:
> On Monday, March 21, 2005 07:11:30 PM -0500 Kyle Moffett 
> <mrmacman_g4@mac.com> wrote:
>> The security bit in that was just some random brainstorming.  I haven't
>> actually written any code for that bit yet.  If you have any better
>> ideas, I'd be glad to hear them.  I'd prefer looking keys up in the
>> Linux keyring system; that won't work on other platforms, but on Linux
>> it's a cleaner API.
>
> I really have to think about this some more.  A lookup method will work
> for rxkad, but it may mean you can't make the fileserver start accepting
> a new key version just by adding it to the KeyFile, which I think works
> today.
>
> I expect a lookup method can also be made to work for rxgk.  I envision
> the context establishment being done in user mode (it's actually done
> via a series of unauthenticated RPCs to a special service).  The result
> of context establishment is that the client gets a token it can use in
> the Rx security exchange; part of that token contains data encrypted in
> a key known only to the server.  It seems like it would work quite well
> to give the kernel a copy of that key, and let it do the Rx security
> exchanges.
>
> I'm just trying to wrap my head around whether using a keyring lookup is
> too restrictive.

I haven't really thought much about an API yet, so if you have any
better ideas or brainstorms, I'd be glad to hear them.

> A few other thoughts...
>
> Rx has an additional layer of multiplexing below the UDP port, in which
> each request is for a particular service (named by an integer).
> Ideally, it would be possible to create separate listen sockets for
> separate services on the same port.  To pick an extreme example, in a
> kaserver compiled with rxgk support, there may be as many as 7 services
> (the ubik vote and disk services, rx stats, rxgk context establishment,
> and three services provided by the kaserver), each serviced by
> completely separate code.

Ok.  I discussed this some in my other email.  We should be able to add
another field to the default sockaddr, and create a new struct called
"sockaddr_rx" or "sockaddr_rxrpc" which includes the extra information
and is used for bind(), etc.  I don't suppose RxRPC or OpenAFS supports
IPv6 yet, does it?  With the new kernel API, that's one thing you would
get largely for free.  (Although maybe not; you probably pass around IPv4
addresses in the protocol fairly frequently. :-\)
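
As a rough illustration of the "extra field in the sockaddr" idea, here
is a minimal user-space sketch.  Everything in it is an assumption made
for illustration: the struct name sockaddr_rxrpc_sketch, its field
names, and the helper rx_addr_v4() are hypothetical and do not come from
any existing kernel header.  It only shows how a service ID could ride
alongside an ordinary IPv4 or IPv6 transport address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical layout; no such struct is defined by the proposal yet. */
struct sockaddr_rxrpc_sketch {
    sa_family_t             srx_family;    /* family of the transport address */
    uint16_t                srx_service;   /* Rx service ID on that port */
    struct sockaddr_storage srx_transport; /* large enough for IPv4 or IPv6 */
};

/* Fill in an IPv4 transport address plus a service ID.
 * Returns 0 on success, -1 if the address string does not parse. */
static int rx_addr_v4(struct sockaddr_rxrpc_sketch *srx, const char *ip,
                      uint16_t port, uint16_t service)
{
    struct sockaddr_in sin;

    assert(srx != NULL && ip != NULL);
    memset(srx, 0, sizeof(*srx));
    memset(&sin, 0, sizeof(sin));

    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
        return -1;

    srx->srx_family = AF_INET;
    srx->srx_service = service;
    memcpy(&srx->srx_transport, &sin, sizeof(sin));
    return 0;
}
```

A server wanting separate listeners for separate services on one port
would, under this assumption, bind() several sockets whose addresses
differ only in srx_service.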

> Similarly, a client needs to be able to control what service it is
> making a call to.  Presumably this means an extra field in the sockaddr.

Yeah, I'll look at it some more.

> How are you handling aborts?  A server signals failure of an RPC by
> sending an abort packet containing a 32-bit error code.  So, it needs
> to be possible for a server to send an abort with a specified error
> code.  On the client side, attempts to read from an aborted call should
> return some sort of error, and the client should have a way of
> determining what the error code was.

I'll look at it a bit, but I think the standard lets us either return
EIO and require the user to call recvmsg with a control blob to get the
error code, or perhaps just return a short read.
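
To make the "recvmsg with a control blob" option concrete, here is a
small sketch of how a client might pull a 32-bit abort code out of
ancillary data.  The cmsg level and type (SOL_RXRPC_SKETCH and
RXRPC_ABORT_SKETCH) are made-up values, and rx_fake_abort() merely
stands in for what the kernel would place in the control buffer; only
the CMSG_* macros are real.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define SOL_RXRPC_SKETCH   0x7272 /* hypothetical cmsg level */
#define RXRPC_ABORT_SKETCH 1      /* hypothetical cmsg type */

/* Scan the control data of a received message for an abort code.
 * Returns 1 and stores the code if one is present, 0 otherwise. */
static int rx_find_abort(struct msghdr *msg, int32_t *code)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_RXRPC_SKETCH &&
            c->cmsg_type == RXRPC_ABORT_SKETCH &&
            c->cmsg_len >= CMSG_LEN(sizeof(*code))) {
            memcpy(code, CMSG_DATA(c), sizeof(*code));
            return 1;
        }
    }
    return 0;
}

/* Stand-in for the kernel side: attach one abort cmsg to msg using the
 * caller's (suitably aligned) buffer. */
static void rx_fake_abort(struct msghdr *msg, void *buf, size_t buflen,
                          int32_t code)
{
    struct cmsghdr *c;

    assert(buflen >= CMSG_SPACE(sizeof(code)));
    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(code));

    c = CMSG_FIRSTHDR(msg);
    c->cmsg_level = SOL_RXRPC_SKETCH;
    c->cmsg_type = RXRPC_ABORT_SKETCH;
    c->cmsg_len = CMSG_LEN(sizeof(code));
    memcpy(CMSG_DATA(c), &code, sizeof(code));
}
```

With something like this, a plain read() could fail with EIO and the
caller would follow up with recvmsg() plus rx_find_abort() to learn why.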

> An Rx call is essentially a short-lived stream-oriented data path, and
> both clients and servers need to be able to treat a call as a stream.
> So, it's best if in normal operation they can actually _use_ read and
> write, from a practical standpoint, rather than having to use
> sendmsg/recvmsg to do anything useful.

I plan to support readv, writev, sendmsg, and recvmsg.

> How are you handling call turnaround?  Clients in particular need to be
> able to signal that it is time to turn around the connection.

That will probably be done with a sendmsg with a particular control blob.
I still need to stare at it more, but I think that way we can do this:

write()
write()
write()
sendmsg()
read()
read()
read()

The sendmsg could optionally send a bit of data before reversing the
connection, I suppose.
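
A sketch of what that final sendmsg() might carry, assuming the
turnaround is signalled by an empty control message.  The level and type
constants and the helper rx_prepare_turnaround() are hypothetical; the
code only builds the msghdr in user space and does not send anything.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define SOL_RXRPC_SKETCH        0x7272 /* hypothetical cmsg level */
#define RXRPC_TURNAROUND_SKETCH 2      /* hypothetical cmsg type */

/* Build the message for the last send of a call: an optional final
 * chunk of data plus a data-less cmsg asking to reverse the call. */
static void rx_prepare_turnaround(struct msghdr *msg, struct iovec *iov,
                                  void *cbuf, size_t cbuflen,
                                  void *last, size_t lastlen)
{
    struct cmsghdr *c;

    assert(cbuflen >= CMSG_SPACE(0));
    memset(msg, 0, sizeof(*msg));
    memset(cbuf, 0, cbuflen);

    if (lastlen > 0) {
        /* Send a last bit of data together with the turnaround. */
        iov->iov_base = last;
        iov->iov_len = lastlen;
        msg->msg_iov = iov;
        msg->msg_iovlen = 1;
    }

    msg->msg_control = cbuf;
    msg->msg_controllen = CMSG_SPACE(0);
    c = CMSG_FIRSTHDR(msg);
    c->cmsg_level = SOL_RXRPC_SKETCH;
    c->cmsg_type = RXRPC_TURNAROUND_SKETCH;
    c->cmsg_len = CMSG_LEN(0);   /* header only, no payload */
}
```

The caller would then pass the prepared msghdr to sendmsg() on the call's
socket and switch to read().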

One other tidbit about the Linux Rx layer: its design is clean enough
that you should be able to test a preliminary TCP version without much
new code, if you want to.  From user-space, the only change for such a
system would be SOCK_RXRPC_STREAM vs. SOCK_RXRPC (or should it be more
like SOCK_RXRPC_{STREAM,DGRAM}?).  Also, we'll support both IPv4 and
IPv6 in the initial implementation (and maybe unix sockets too) via

socket( AF_INET,  SOCK_RXRPC_*, 0 );
socket( AF_INET6, SOCK_RXRPC_*, 0 );
socket( AF_UNIX,  SOCK_RXRPC_*, 0 );

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$ r  
!y?(-)
------END GEEK CODE BLOCK------