[OpenAFS] Possible explanation(s) for obvious performance problems?

Simon Wilkinson sxw@inf.ed.ac.uk
Sat, 24 Apr 2010 20:20:37 +0100

On 24 Apr 2010, at 19:00, Holger Rauch wrote:
> (Both dd commands were run directly on the file server host in order
> to rule out possible network latency problems as a cause for the bad
> performance).

So, it's very important to realise that this is a terrible comparison.
Whilst both filesystems ultimately result in the data hitting an ext3
filesystem, OpenAFS is doing significantly more work.

In the ext3 case, dd calls the write() syscall, the data gets copied
from user into kernel space, the kernel marks the page as dirty, and at
a moment of its choosing flushes that page to disk. Job done.
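That local-disk path can be sketched in a few lines of Python (a minimal
sketch, using a throwaway temp file): write() returns as soon as the data
is in the page cache, and it's fsync() that actually forces the dirty
pages out to disk.

```python
import os
import tempfile

# Sketch of the local ext3 path: write() copies the data into the
# kernel page cache and returns; the kernel flushes the dirty page
# to disk later, at a moment of its choosing.
data = b"x" * (1024 * 1024)  # 1 MiB of payload

fd, path = tempfile.mkstemp()
try:
    os.write(fd, data)   # returns once the page cache holds the data
    os.fsync(fd)         # explicitly force the dirty pages to disk
finally:
    os.close(fd)

# Read the file back to confirm the data actually landed.
with open(path, "rb") as f:
    assert f.read() == data
os.remove(path)
```

This is also why a plain dd looks so fast on a local filesystem: unless
you force a sync, it's mostly measuring how quickly you can fill the
page cache.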

With OpenAFS, dd calls write(), and the data gets copied into the
kernel. The OpenAFS kernel module then copies the page to the local
disk cache (by calling ext3's write), and returns control to the user.
When dd completes it calls close. The kernel module then loads the data
back from the disk cache (by calling ext3's read, if the data has been
paged out), and converts it into a set of arguments to an RPC call. It
figures out where to deliver that RPC to, possibly by making network
calls to the vlserver. The RPC is then checksummed, encrypted, split up
into appropriately addressed UDP packets and passed to the kernel's
networking stack. The networking drivers then route the packets
(hopefully round the loopback interface, but that does depend) and
deliver them to the fileserver. This runs in userspace, so the data
gets copied out of the kernel into the fileserver's buffers. The file
server then decrypts the data, decodes the RPC arguments to get the
data being written, works out which file it corresponds to, and whether
it needs to notify anyone that that file has changed. It then calls
ext3's write() syscall, which copies the data from user into kernel
space; the kernel marks that page as dirty, and eventually flushes it
to disk. Finally, we're done.
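To make the contrast concrete, here's a toy Python model of that copy
chain. Every name in it is invented for illustration - each hop just
moves the bytes and counts one copy, where the real path obviously does
far more than copy:

```python
# Toy model of the OpenAFS write path: each hop stands in for one of
# the copies described above. The labels are invented for illustration.

copies = 0

def hop(buf, label):
    """Simulate one buffer copy along the write path."""
    global copies
    copies += 1
    return bytes(buf)  # a genuine copy of the data

data = b"payload"
buf = hop(data, "user -> kernel (write syscall)")
buf = hop(buf, "kernel -> local disk cache (ext3 write)")
buf = hop(buf, "disk cache -> RPC arguments (on close)")
buf = hop(buf, "RPC -> checksummed/encrypted UDP packets")
buf = hop(buf, "network stack -> fileserver userspace buffers")
buf = hop(buf, "fileserver -> ext3 write (kernel page cache)")

assert buf == data  # the data survives every hop unchanged
assert copies == 6  # versus a single copy in the local ext3 case
```

Six-plus traversals of the data against one, before you even count the
checksumming, encryption and RPC machinery in between.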

With a level playing field, a directly connected local disk will always
be faster than a network filesystem - there's simply less work to be
done when throwing data straight onto a local disk than there is when
you send it across the network.

> Any ideas as to where that bad performance might come from? (I do have
> encryption enabled, but since it's only plain DES encryption on
> current machines that most likely can't be an explanation for the
> performance problem).

The encryption isn't DES, it's fcrypt. And encryption does have a
significant effect on performance (not only doing the encryption
itself, but also the number of additional copies that it adds into the
data path).

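fcrypt itself isn't something you can pull out of the Python standard
library, so the sketch below uses a trivial XOR cipher purely as a
stand-in to show the shape of the cost: encryption is one more full
pass over (and one more copy of) every byte written, on top of the
copies the write path already makes.

```python
# A trivial XOR cipher standing in for fcrypt, purely to illustrate
# that encryption adds an extra pass over, and an extra copy of,
# every byte that crosses the wire.

KEY = 0x5A  # arbitrary single-byte key for the stand-in cipher

def xor_crypt(buf: bytes) -> bytes:
    # One more pass over the data, producing one more copy of it.
    return bytes(b ^ KEY for b in buf)

packet = b"rpc payload bytes"
wire = xor_crypt(packet)          # what goes onto the wire
assert wire != packet             # the data really was transformed
assert xor_crypt(wire) == packet  # XOR is its own inverse
```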
It's worth taking a look at the configuration of your fileserver.
There's been a lot written here in the past about the ideal settings
for the fileserver - I'll leave you to Google over the list, but it's
just worth noting that the out-of-the-box configuration is not likely
to result in good performance.

As a datapoint, using your test across a network, my home directory is
seeing write speeds of 70 MB/s from an OpenAFS 1.4.11 client. We're
currently running our fileservers with "-L -p 128 -rxpck 400 -busyat
600 -s 1200 -l 1200 -cb 100000 -b 240 -vc 1200".

All that said, we are trying to improve the performance of OpenAFS.
There are numerous changes in the 1.5 tree, particularly for Linux,
that help speed it up. I'd also encourage you to look at more
meaningful benchmarks - in particular, those which mirror the kind of
use you'll actually be putting the filesystem to.
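As a sketch of what a more workload-shaped benchmark might look like,
the snippet below times a many-small-files workload (think build trees
or home directories) against one large sequential write of the same
total size, which is roughly what the dd test measures. The file counts
and sizes are arbitrary; on a real deployment you'd point it at a
directory inside AFS.

```python
import os
import tempfile
import time

def time_workload(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def small_files(root, count=200, size=4096):
    # Metadata-heavy workload: many small synced files.
    payload = b"x" * size
    for i in range(count):
        with open(os.path.join(root, f"f{i}"), "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())

def one_big_file(root, size=200 * 4096):
    # Streaming workload: the same bytes in one file, dd-style.
    with open(os.path.join(root, "big"), "wb") as f:
        f.write(b"x" * size)
        f.flush()
        os.fsync(f.fileno())

with tempfile.TemporaryDirectory() as d:
    t_small = time_workload(lambda: small_files(d))
    t_big = time_workload(lambda: one_big_file(d))

print(f"200 x 4 KiB files: {t_small:.4f}s, 1 x 800 KiB file: {t_big:.4f}s")
```

On a network filesystem the gap between those two numbers is usually
far more informative than a single streaming figure.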