[OpenAFS] 1.4.8, Rx Performance Improvements, and a Small Business Innovative Research grant

Chas Williams (CONTRACTOR) chas@cmf.nrl.navy.mil
Mon, 06 Oct 2008 09:17:57 -0400


at one point, people talked about making rx work with xplot.  some time
ago, i did this but didn't really get far enough to have something that
i would consider release quality.  however, given the interest, i think
i will just release what i have.

ftp://ftp.cmf.nrl.navy.mil/pub/chas/afs/rxtrace2xplot.pl

there are some comments at the top about how to use this.  basically you
feed it the output from tcpdump and it breaks it up by call.  i believe
the converter still gets confused at times by the bidirectional nature
of rx.  the program attempts to print helpful diagnostics about the
sessions under trace.  i probably did not get everything right about
the rx protocol, so take its output with a grain of salt.
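
to give an idea of what "breaking it up by call" means, here is a
minimal python sketch.  this is not the perl script and does not
reproduce its parsing; it assumes you already have the raw udp payloads
in hand, and it assumes the 28-byte rx header layout as i understand it
(epoch, cid, call number, seq, serial, type, flags, user status,
security index, spare, service id).  the names parse_rx_header and
group_by_call are made up for illustration.

```python
import struct
from collections import defaultdict

# fixed rx header, network byte order, 28 bytes (assumed layout):
# epoch, cid, callNumber, seq, serial, type, flags, userStatus,
# securityIndex, spare, serviceId
RX_HEADER = struct.Struct("!IIIIIBBBBHH")

# a few of the rx packet types; anything else is reported by number
RX_PACKET_TYPES = {1: "data", 2: "ack", 3: "busy", 4: "abort", 5: "ackall"}

# flag bit set on packets sent by the side that initiated the call (assumed)
RX_CLIENT_INITIATED = 0x01


def parse_rx_header(payload):
    """decode the fixed rx header from the front of one udp payload."""
    if len(payload) < RX_HEADER.size:
        raise ValueError("payload too short to hold an rx header")
    (epoch, cid, call_number, seq, serial, ptype, flags,
     _user_status, _security_index, _spare, service_id) = \
        RX_HEADER.unpack_from(payload)
    return {
        "epoch": epoch,
        "cid": cid,
        "call_number": call_number,
        "seq": seq,
        "serial": serial,
        "type": RX_PACKET_TYPES.get(ptype, str(ptype)),
        "from_client": bool(flags & RX_CLIENT_INITIATED),
        "service_id": service_id,
    }


def group_by_call(payloads):
    """bucket packets by (epoch, cid, call number), keeping arrival order.

    both directions of a call carry the same triple, so each bucket holds
    the client and server packets interleaved.
    """
    calls = defaultdict(list)
    for payload in payloads:
        hdr = parse_rx_header(payload)
        calls[(hdr["epoch"], hdr["cid"], hdr["call_number"])].append(hdr)
    return dict(calls)
```

since both directions land in the same bucket, the from_client bit is
what you would use to tell the two sides of a call apart.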

anyway, i included some sample files in that directory as well.  the
sample shows activity from a client.  the client is apparently having
trouble: you can clearly see periodic retransmits due to a missing
ack.  i highlight these in red on the xplot graph.  see the
jpeg for a zoomed view.

it helps to take a look at some tcp transactions first to get an idea
of what xplot is trying to tell you.  but the rx view is basically the
same.  i try to draw the hard and soft ack windows; as i recall, soft
acks are light gray.

watching a 'normal' rx connection is interesting.  the slow start is
pretty obvious.  something i notice is that afs rarely manages to
fill its transmit window.

i could dig up some more sample traces if there is interest.

and one comment about this:

>Rainer Toebbicke wrote:
> 3. the path for handling an ACK packet is very long, I measured on the
> order of  10 microseconds on average on a modern processor. At over 100
> MB/s you'd be handling ~50000 ACKs per second in a non-jumbogram
> configuration and have hardly any time left to send out new packets. A
> lot is spent on waiting for the call-lock: even when that one is
> released quickly (which it isn't in the standard implementation, as the
> code leisurely walks around with it for extended periods, but I
> experimented with a "release" flag), the detour through the scheduler
> slows down things dramatically. The lock structure should probably be
> revisited to make contention between ack recv & transmit threads less
> likely;

afs' ack strategy is broken as designed.  rx can potentially ack every
other packet twice, soft and hard.  scanning/handling of the softack
field is just a killer.  we should just handle hard acks and use the
softack field to pad the following data fields to 'good' alignments.
i believe this could be made backward compatible with the existing
protocol without too much trouble.  stretch acks are a big win for tcp
and other 'enhanced' protocols.  this is also part of the reason why
jumbograms are a win.  they essentially stretch the ack interval since
only a single ack is sent for each jumbogram instead of each segment of
the jumbogram.
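
to put rough numbers on the stretch ack argument, here is a
back-of-the-envelope python sketch.  it takes the two figures quoted
above at face value (about 10 microseconds per ack and about 50000 acks
per second without jumbograms) and assumes that acking once per n
packets simply divides the ack rate by n.  that is a simplification,
not a measurement, and ack_cpu_fraction is a made-up name.

```python
ACK_COST_SECONDS = 10e-6          # ~10 microseconds per ack (quoted figure)
BASELINE_ACKS_PER_SECOND = 50000  # quoted ack rate without jumbograms


def ack_cpu_fraction(stretch,
                     baseline_acks=BASELINE_ACKS_PER_SECOND,
                     ack_cost=ACK_COST_SECONDS):
    """fraction of one cpu-second spent handling acks.

    stretch is how many packets each ack covers compared with the
    baseline; stretch=1 is the baseline itself.  assumes the ack rate
    scales as baseline / stretch.
    """
    if stretch < 1:
        raise ValueError("stretch must be at least 1")
    return (baseline_acks / stretch) * ack_cost


if __name__ == "__main__":
    for stretch in (1, 2, 4):
        share = ack_cpu_fraction(stretch)
        print(f"stretch {stretch}: {share * 100:.1f}% of a cpu on acks")
```

at the baseline this comes to about half of one cpu spent on acks, which
matches the "hardly any time left" complaint.  each doubling of the ack
interval halves that share under the assumption above.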