[OpenAFS-devel] Progress on Linux in-kernel RxRPC library

Jeffrey Hutzelman jhutz@cmu.edu
Tue, 22 Mar 2005 20:13:03 -0500


On Tuesday, March 22, 2005 07:35:11 PM -0500 Kyle Moffett 
<mrmacman_g4@mac.com> wrote:

> In any case, the hack would be to create a new key
> with a different ID but the same data.  That would force the kernel
> Rx layer to use a new connection for anything using that key.

That would work, but it is indeed a hack.  I think there's a better answer, 
at least for the common case.  The problem here is that fileservers track 
things about their clients like callback state and capabilities.  The 
fileserver assumes that all connections coming from the same host and port 
belong to the same cache manager, and thus share the same cached state.  It 
can tell when a client changes addresses, or has more than one address, by 
checking UUIDs, but the invariant must still hold -- active connections
from the same address/port must belong to the same client.

Now, it turns out that fileserver clients must implement the callback 
interface, even if they do no caching.  Further, the callback interface 
must run on the same address and port that the outgoing calls originate 
from.  In the current model, that means the client receives incoming calls 
on the same UDP socket as its outgoing calls, and in most cases, actually 
binds that socket to a specific port before originating any outgoing 
traffic.

So, in order to be a fileserver client using your sockets interface, I need 
to be able to establish a listen socket and then make outgoing calls that 
will originate from the same port that I'm listening on.  Presumably this 
would be done by calling getsockname() on the listen socket, and then 
binding the socket for the outgoing call to the same port before calling 
connect or sendto or whatever.
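
To make that concrete, here's a rough userspace sketch of the pattern, using 
plain UDP sockets as a stand-in for whatever the Rx sockets interface ends up 
looking like.  With raw UDP this is only an approximation: both sockets have 
to set SO_REUSEADDR for the second bind, and which socket receives a given 
incoming datagram isn't well-defined -- which is exactly why today's cache 
manager just uses one socket for both directions.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in local;
        socklen_t len = sizeof(local);
        int one = 1;

        /* The "listen" socket: the callback service accepts calls here. */
        int lsock = socket(AF_INET, SOCK_DGRAM, 0);
        setsockopt(lsock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        memset(&local, 0, sizeof(local));
        local.sin_family = AF_INET;
        local.sin_addr.s_addr = htonl(INADDR_ANY);
        local.sin_port = 0;                     /* let the kernel pick */
        if (bind(lsock, (struct sockaddr *)&local, sizeof(local)) < 0 ||
            getsockname(lsock, (struct sockaddr *)&local, &len) < 0) {
            perror("listen socket");
            return 1;
        }

        /* The outgoing-call socket: bound to the same port, so calls to
         * the fileserver originate from the callback service's endpoint.
         * A real Rx sockets layer would presumably fold both sockets into
         * a single underlying endpoint instead. */
        int osock = socket(AF_INET, SOCK_DGRAM, 0);
        setsockopt(osock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
        if (bind(osock, (struct sockaddr *)&local, sizeof(local)) < 0) {
            perror("outgoing socket");
            return 1;
        }

        /* ... connect()/sendto() to the fileserver via osock as usual ... */
        printf("listening and originating on port %u\n",
               ntohs(local.sin_port));
        close(osock);
        close(lsock);
        return 0;
    }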

So here comes the fun part...
Just as with other transports, the Rx sockets interface should not allow 
two distinct processes to bind the same port(*).  Calls made via a socket 
bound to a specific port must actually originate from that port, so it 
stands to reason that calls made via two different sockets bound to 
different ports will not share connections.  So the problem is _almost_ 
solved.  It just requires one additional constraint, which is that a call 
made via an unbound socket must not share a connection with a call made via 
a bound socket.

Obviously, calls made via different unbound sockets can share connections, 
provided the credentials are the same.  And, calls made via sockets bound 
to the same port can also share connections.  Since the cache manager will 
bind its port, no user-mode process will be able to share connections with 
it, and everything works out.


(*) There is one exception: for full generality, it might be desirable to 
allow two servers running in different processes to bind to different Rx 
services running on the same port.  If that is done, then we'd have to come 
up with a rule to determine when that is allowed, and who (if anyone) gets to 
make calls from that port.  There is nothing in AFS today that requires 
this capability.


>> And it's almost certainly _not_ a good idea for user-mode
>> processes to share connections with the cache manager, because the
>> fileserver tracks information about clients, including their
>> capabilities and callback state, partly by connection.
>
> Hrm, that really would seem to be bug-like, although if it keeps
> track of capabilities by connection it would be nearly impossible to
> work around.  I don't suppose there is any way to have the
> fileserver keep track of such data any other way?

It's not so much by connection as by endpoint, as I described above.  The 
key point is that calls being made via the same connection are assumed to 
be from the same process, which I think is a reasonable assumption.



> Well, other models are:
> Create a thread to handle a specific connection from a specific
> user, including all the RPCs they ever need to make. Or, create a
> bunch of worker threads beforehand and just have them switch keys
> whenever they need to change which user they act as.

Yeah.  I dunno; forcing what I see as a parameter into what is essentially 
(thread-)global state just seems wrong to me.


>> Yeah.  This is essentially what Derrick's (as yet unreleased)
>> code does. I'm sorry if I was unclear here -- it's not that
>> automatic connection tracking won't _work_; it just doesn't save
>> the cache manager from figuring out what PAG a process is in.
>
> Ah, ok, I think the confusion here is that the "user" identity
> would no longer be called a "PAG" by the Linux kernel, it would
> instead be called an "afstoken" key id.

Well, the Linux kernel already doesn't call it a PAG :-)
But yeah, we treat it the same, and just call it something different.
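
For illustration, the userspace side of that might look roughly like this, 
using the 2.6 keyutils library.  The "afstoken" type name is the one floated 
in this thread and is purely hypothetical until an AFS module actually 
registers it; the dummy payload is likewise just a placeholder.

    /* Build with -lkeyutils on a 2.6 kernel with key management enabled. */
    #include <keyutils.h>
    #include <stdio.h>

    int main(void)
    {
        /* Equivalent of "entering a PAG": get a fresh anonymous session
         * keyring that child processes will inherit. */
        key_serial_t session = keyctl_join_session_keyring(NULL);
        if (session < 0) {
            perror("keyctl_join_session_keyring");
            return 1;
        }

        /* A real client would store the marshalled AFS token here. */
        const char payload[] = "opaque-afs-token-blob";
        key_serial_t token = add_key("afstoken", "afs@EXAMPLE.COM",
                                     payload, sizeof(payload) - 1,
                                     KEY_SPEC_SESSION_KEYRING);
        if (token < 0) {
            perror("add_key");  /* expected unless the kernel knows "afstoken" */
            return 1;
        }

        printf("session keyring %d, token key id %d\n", session, token);
        return 0;
    }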


> It provides these kinds of functions for kernel-space users:
>
> There are more, and I haven't tinkered with my quick grep&sed&awk
> script to give function arguments, but you should get the idea.

Yeah.
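
As a rough idea of how a cache manager might consume that kernel API, here's 
a hedged sketch.  The "afstoken" key type and afs_find_token() are made up 
for illustration; only request_key(), key_validate() and key_put() are real 
2.6 interfaces.

    #include <linux/err.h>
    #include <linux/key.h>

    /* Hypothetical key type, registered elsewhere with register_key_type(). */
    extern struct key_type key_type_afstoken;

    static struct key *afs_find_token(const char *cell)
    {
        struct key *key;
        int ret;

        /* Searches the caller's thread, process and session keyrings, so
         * the result naturally follows the "PAG" the caller is in. */
        key = request_key(&key_type_afstoken, cell, NULL);
        if (IS_ERR(key))
            return key;

        /* Reject revoked or expired tokens. */
        ret = key_validate(key);
        if (ret < 0) {
            key_put(key);
            return ERR_PTR(ret);
        }

        return key;             /* caller must key_put() when done */
    }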




> Hmm, ok, we'll have another go at it.  We're currently rebuilding
> our setup on new servers in a highly-available configuration
> (As per my messages on the Openafs-info list).  We'll try using
> OpenAFS 1.3 on Linux 2.6 for a bit, and see how that works.  The
> one part I have the hardest time with is getting decent OpenAFS
> 1.3 packages for Debian, but I think I may have those issues
> resolved now.  We'll try it and see.

Great.

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA