[OpenAFS-devel] Rx over TCP to solve some NAT & Firewall issues?

Dean Anderson dean@av8.com
Thu, 20 Nov 2003 16:21:33 -0500 (EST)


On Thu, 20 Nov 2003, Nickolai Zeldovich wrote:

> >                                            apparently jumbograms are
> > the way they are because people wanted a form of congestion control on
> > afs (controlling number of rx datagrams in a packet).
>
> Rx already has congestion control -- quite similar to TCP Reno with SACK.
> It has slow-start, AIMD and fast-recovery.  It doesn't seem to have fast
> retransmit, because it still seems to make the assumption that packets can
> get reordered.  Maybe we should fix this -- it should be quite simple.  Rx
> already has a SACK-like ack packet.

Err, I'm not following you above. Packets _can_ be reordered, especially
on a wide area network, so I don't see what the problem is with 'still
making this assumption'.
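
For reference, TCP-style fast retransmit does not assume in-order delivery:
it waits for a few duplicate acks before resending, so mild reordering is
tolerated. The sketch below illustrates that generic idea only. It is not
Rx code; the function names are made up for illustration, and the threshold
of 3 duplicate acks is the conventional TCP value, assumed here rather than
taken from Rx.

```python
DUP_ACK_THRESHOLD = 3  # conventional TCP value; an assumption, not an Rx constant


def cumulative_acks(arrivals):
    """Simulate a receiver: for each arriving sequence number, return the
    cumulative ack it would send (the next sequence number it expects)."""
    received = set()
    next_expected = 1
    acks = []
    for seq in arrivals:
        received.add(seq)
        # Advance past every packet that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_points(acks, threshold=DUP_ACK_THRESHOLD):
    """Return the sequence numbers a sender would fast-retransmit, i.e.
    those acked `threshold` extra times in a row."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# Packet 3 arrives one slot late: a single duplicate ack, no retransmit.
reordered = fast_retransmit_points(cumulative_acks([1, 2, 4, 3, 5]))

# Packet 3 is lost: packets 4..7 each trigger a duplicate ack for 3.
lost = fast_retransmit_points(cumulative_acks([1, 2, 4, 5, 6, 7]))
```

With these inputs, the reordered case triggers no retransmission, while the
lost-packet case triggers exactly one, for packet 3.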

> One problem is that currently the window size is limited to 32 packets,
> which is 32*1444=46k of bandwidth-delay product.  That means I can only
> get ~500KB/sec throughput from east coast to west coast.  This problem is
> easy to fix by bumping up the max sender/receiver windows, but that's not
> the problem affecting performance in local networks.

Agreed.  The window size could be scaled up.
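
The arithmetic behind the quoted figures can be checked with a short
sketch. The 32-packet window and 1444-byte payload come from the message
above; the 90 ms coast-to-coast round-trip time and the 10 MB/sec target
are assumed values chosen for illustration, and the function names are
hypothetical.

```python
import math

RX_MAX_WINDOW_PACKETS = 32      # window limit quoted in the thread
RX_PACKET_PAYLOAD_BYTES = 1444  # per-packet payload quoted in the thread
ASSUMED_RTT_SECONDS = 0.09      # assumption: roughly 90 ms coast to coast


def window_bytes(packets=RX_MAX_WINDOW_PACKETS, payload=RX_PACKET_PAYLOAD_BYTES):
    """Bytes that may be in flight before the sender must wait for acks."""
    return packets * payload


def max_throughput(window, rtt_seconds):
    """Upper bound on throughput (bytes/sec): one full window per round trip."""
    return window / rtt_seconds


def packets_needed(target_bytes_per_sec, rtt_seconds, payload=RX_PACKET_PAYLOAD_BYTES):
    """Window size, in packets, needed to sustain a target rate at a given RTT."""
    return math.ceil(target_bytes_per_sec * rtt_seconds / payload)


window = window_bytes()                                # 46208 bytes, about 46k
limit = max_throughput(window, ASSUMED_RTT_SECONDS)    # about 513,000 bytes/sec
needed = packets_needed(10_000_000, ASSUMED_RTT_SECONDS)  # window for 10 MB/sec
```

Under the assumed RTT this gives a ceiling of roughly 500KB/sec, in line
with the quoted number, and a window of several hundred packets to reach
10 MB/sec over the same path.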

> I don't believe that Rx over TCP would have to keep more state than Rx
> over UDP.  After all, Rx over UDP pretty much keeps a TCP-like connection
> state structure in memory in userspace.  One could argue that TCP state is
> in unpageable kernel memory, but I think if your server is paging, you've
> lost already.
>
> As for the persistence of TCP connections, one could quite easily define
> them to be garbage-collectable at any time by either the server or the
> client, just like Rx over UDP connections are now.  If the server thinks
> it has too many connections open, it'll close idle client connections.
>
> Do people really think that Rx over UDP, designed 15 years ago, can be a
> better reliable stream transport than the TCP in today's kernels?  What
> features of Rx over UDP are so unique that preclude the use of TCP?

Socket setup time is much shorter with UDP, since there is no connection
handshake. Cleanup time is also much shorter, since there is no teardown
exchange or TIME_WAIT state.

If we were to make major RPC changes, I'd suggest that we try to move to
DCE RPC, which is quite a lot smarter, and for which there are better
security alternatives, such as SSL authentication. If only we could get
the last DCE vendor to sign off on putting DCE under the GPL...  Of course,
then we could think about moving to DFS, too.

There are probably other RPC alternatives, even to the extent of using
CORBA or something.  Getting Rx more widely used, and in a separate
distribution, may encourage performance enhancements that would benefit
OpenAFS.

		--Dean