[OpenAFS] About AFS performance over WAN
Rainer Toebbicke
rtb@pclella.cern.ch
Mon, 1 Dec 2008 10:24:34 +0100
Giovanni Bracco wrote:
> I know that this is a well-known problem of the rx protocol, as shown for
> example by Hartmut Reuter at the last European AFS Conference 2008
> (see slide 49 from
> http://www.openafs.at/drupal/files/slides/1Day_03/AFS-OSD.pdf),
> due to the fixed rx window size combined with network latencies on the
> order of tens of milliseconds.
>
> I am aware that an activity was in progress for a tcp version of openafs,
> which probably could solve some of this problem, but I do not know what is
> the status of this activity.
> More generally, what are the plans to increase AFS performance over WAN,
> to take advantage of the present-day availability of high-bandwidth
> connections?
>
What RX-over-TCP would bring you is all the improvements that went into TCP
over the past decade of research, carried over to RX. It would also make
things more familiar for network administrators, who deal mainly with TCP
considerations. What it will not bring you is a bulk transfer protocol.
AFS transfers files chunk-wise; although there is read-ahead, the transfer is
essentially still sequential. Due to the RPC nature of the protocol you will
have a stop every 64K (or 256K, or whatever chunk size you typically set). A
plain port to TCP will not change anything there. Worse, such start-stop
behaviour within a single TCP stream could very well defeat the sophisticated
techniques that went into window heuristics and congestion control.
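
To put rough numbers on the stop-every-chunk effect, here is a minimal
back-of-the-envelope sketch. It assumes, as a simplification, that every chunk
costs its wire time plus one full round trip of idle time, and it ignores
read-ahead entirely. The function name and the example figures are
illustrative and are not taken from OpenAFS.

```python
def chunked_throughput(chunk_bytes, link_bytes_per_s, rtt_s):
    """Effective rate when each chunk is a separate request/response.

    Simplified model (an assumption, not the real RX behaviour): sending a
    chunk takes chunk_bytes / link_bytes_per_s seconds on the wire, and the
    next chunk cannot start until one round trip (rtt_s) has elapsed.
    """
    time_per_chunk = chunk_bytes / link_bytes_per_s + rtt_s
    return chunk_bytes / time_per_chunk


# Illustrative figures: a 1 Gbit/s path (125e6 bytes/s) with a 20 ms RTT.
LINK = 125e6
RTT = 0.020
rate_64k = chunked_throughput(64 * 1024, LINK, RTT)
rate_256k = chunked_throughput(256 * 1024, LINK, RTT)
```

Under these assumptions a 64K chunk size yields roughly 3 MB/s and a 256K
chunk size roughly 12 MB/s on a link that could carry 125 MB/s: the round
trip dominates, whatever transport sits underneath.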
With unlimited development resources, AFS would deserve a protocol better
suited to it than TCP. In practice, with a little more realism, my gut feeling
is that at least some more thought should be devoted to improving plain RX
rather than betting on another horse. I have tried occasionally over the past
years, with some improvements that Hartmut tested as well, but my brain being
what it is and the matter being relatively complicated, the results remain
modest.
High latency remains a fierce enemy. Some address it through pre-fetches,
which are a double-edged sword! For reads, if the file system had reliable
knowledge that big files (or series of files) are to be transferred in their
entirety, AFS could relatively easily be modified to start chunk pre-fetches
in parallel, slightly shifted in time, over standard RX, solving the
start-stop problem. The key here is to do this only if you're sure you're not
over-speculating and throwing away most of it soon after.
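
The idea of parallel, slightly time-shifted chunk pre-fetches can be sketched
as follows. This is a toy model and not OpenAFS code: fetch_chunk is a
hypothetical stand-in that sleeps to mimic one round trip, and
whole_file_expected is an assumed hint standing for the "reliable knowledge"
mentioned above. Without that hint the sketch falls back to plain sequential
fetching so that nothing is pre-fetched speculatively.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # illustrative chunk size


def fetch_chunk(data, index, latency_s):
    """Hypothetical chunk fetch: sleep for one 'round trip', return the slice."""
    time.sleep(latency_s)
    start = index * CHUNK_SIZE
    return data[start:start + CHUNK_SIZE]


def chunk_count(data):
    """Number of chunks needed to cover data (ceiling division)."""
    return -(-len(data) // CHUNK_SIZE)


def fetch_sequential(data, latency_s):
    """One chunk at a time: every chunk pays the full latency in turn."""
    return b"".join(
        fetch_chunk(data, i, latency_s) for i in range(chunk_count(data))
    )


def fetch_file(data, latency_s, whole_file_expected,
               parallel=4, stagger_s=0.001):
    """Overlap chunk fetches only when the whole file is known to be needed."""
    if not whole_file_expected:
        return fetch_sequential(data, latency_s)
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = []
        for i in range(chunk_count(data)):
            futures.append(pool.submit(fetch_chunk, data, i, latency_s))
            time.sleep(stagger_s)  # shift the start of each fetch slightly
        # Results are joined in submission order, so the data stays in order.
        return b"".join(f.result() for f in futures)
```

With up to four fetches in flight, the latency of one chunk overlaps with the
transfer of the others, which is what removes the start-stop pattern in this
model. The values of parallel and stagger_s are arbitrary here.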
For writes, here at CERN we already run with mods that start chunk
transmission early, while the file is still being written to. Naively, I would
think that case is vastly easier to improve, given that much more is known!
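
As an illustration of starting chunk transmission early on the write side,
here is a minimal sketch of a buffer that hands each chunk to a sender as
soon as it is full, rather than waiting for the file to be closed. It is not
the CERN modification: the class name EarlyFlushWriter and the send_chunk
callback are hypothetical, and a real client would also have to handle
rewrites of chunks that were already sent.

```python
class EarlyFlushWriter:
    """Buffer writes and send each chunk as soon as it is complete."""

    def __init__(self, send_chunk, chunk_size=64 * 1024):
        self._send = send_chunk        # hypothetical callback(index, bytes)
        self._chunk_size = chunk_size
        self._buf = bytearray()
        self._next_index = 0

    def write(self, data):
        """Append data; transmit every chunk that has become full."""
        self._buf += data
        while len(self._buf) >= self._chunk_size:
            self._flush(self._chunk_size)

    def close(self):
        """Send whatever partial chunk remains when the file is closed."""
        if self._buf:
            self._flush(len(self._buf))

    def _flush(self, n):
        chunk = bytes(self._buf[:n])
        del self._buf[:n]
        self._send(self._next_index, chunk)
        self._next_index += 1
```

Only the final partial chunk has to wait for close(), so most of the
transmission can overlap with the application still producing data.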
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics(CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155