[OpenAFS] Re: Expected performance
Thu, 19 May 2011 14:57:16 -0400 (EDT)
On 2011-05-19 at 13:25, Andrew Deason ( firstname.lastname@example.org ) said:
> On Tue, 17 May 2011 23:14:03 +0100
> Hugo Monteiro <email@example.com> wrote:
>> - Low performance and high discrepancy between test results
>> Transfer rates (only a few) hardly touched 30MB/s between the server and
>> a client sitting on the same network, connected via GB ethernet. Most of
>> the times that transfer rate is around 20MB/s, falling down to 13 or
>> 14MB/s in some cases.
> The client and server configs would help. I'm not used to looking at
> single-client performance, but... assuming you're using a disk cache,
> keep in mind the data is written twice: once to the cache and once on
> the server. So, especially when you're running the client and server on
> the same machine, there's no way you're going to reach the theoretical
> 110M/s of the disk.
You can certainly get close if your disk for the disk cache is fast
enough. I've seen close to 80MB/s with 15K SAS under ideal conditions.
Re: client and server on the same machine - I've seen that actually result
in lower performance. When you take the physical network out of the mix,
Rx itself starts limiting you, seemingly as a function of CPU usage.
> You may want to see what you get with memcache (or if you want to try a
> 1.6 client, cache bypass) and a higher chunksize. Just running dd on a
> box I have, running a 1.4 afsd with -memcache -chunksize 24 made it jump
> from the low 20s to high 40s/low 50s (M/s), after starting with the
> defaults for a 100M disk cache.
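A quick way to reproduce that kind of single-client test is a streaming dd into an AFS path (the /afs/example.org paths below are just placeholders for your cell; pick a count that exceeds your cache size if you want to see uncached behaviour):

```shell
# Streaming write into AFS; paths are examples only.
dd if=/dev/zero of=/afs/example.org/user/tmp/ddtest bs=1M count=1024 conv=fsync

# Streaming read back (after flushing it from the local cache, e.g. fs flush).
dd if=/afs/example.org/user/tmp/ddtest of=/dev/null bs=1M
```

conv=fsync makes dd fsync at the end, so the reported rate includes the final writeback rather than just filling the cache.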
Just to add some more data points...
I recently saw peaks of 90M/s for memcache for single-client writes. Reads
from memcache can be as fast as your memory is, so upwards of a couple
hundred M/s.
In general, 1.6 memcache > 1.4 memcache > 1.6 diskcache > 1.4 diskcache.
1.6 disk cache uses a LOT less CPU than 1.4 disk cache, however. Nice for
processes that need IO and CPU at the same time on a machine that might
already be lacking CPU.
Options I used to get those numbers with 1.6.0pre5 (client afsd first,
then the fileserver):
-dynroot -fakestat -afsdb -nosettime -stat 48000 -daemons 12 -volumes 512 -memcache -blocks 655360 -chunksize 19
-p 128 -busyat 600 -rxpck 4096 -s 10000 -l 1200 -cb 1000000 -b 240 -vc 1200 -abortthreshold 0 -udpsize 1048576
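For reference, -chunksize is a power-of-two exponent (chunk size in bytes), and cache -blocks are counted in 1 KiB units, so the numbers above work out as follows (a quick shell check):

```shell
# -chunksize 19 -> 2^19 bytes per chunk (512 KiB);
# the -chunksize 24 mentioned earlier is 16 MiB chunks.
echo $(( 1 << 19 ))              # bytes per chunk at -chunksize 19
echo $(( (1 << 24) / 1048576 )) # MiB per chunk at -chunksize 24

# -blocks 655360 -> 655360 KiB of memcache
echo $(( 655360 / 1024 ))       # MiB of memcache
```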
Server in this case is a very new 16-core Opteron box with 32GB of RAM (it
runs multiple fileserver instances under Solaris zones). Client is a
relatively new 8-core Opteron box with 64GB of memory.
Also in general, client performance seems to get worse the more CPUs you
have. Our 48-core boxes tend to get lower numbers than our smaller 16 and
8 core boxes. I haven't done too many comparison tests to really quantify
how much of a difference that makes, though.
Cache bypass definitely makes things faster for things that aren't cached,
though I'll withhold performance numbers for that since I was testing
bypass inside an ESX VM (one of our webservers). Within the same machine,
bypass got numbers similar to disk cache after the files had been cached
(where the disk cache is a raw FC LUN).
Under normal conditions with fairly modern hardware, you
should expect 50M/s with some simple tuning (-chunksize mostly, and
-memcache if your machine has the memory to spare).
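As a starting point, that simple tuning might look like this on the client (a sketch, not a recommendation; the -blocks value assumes you have roughly 256 MB of RAM to spare, so scale it to your machine):

```shell
# Hypothetical afsd invocation: 256 MiB memory cache, 512 KiB chunks.
afsd -memcache -blocks 262144 -chunksize 19
```

With a disk cache instead, keep -chunksize but drop -memcache and point -cachedir at a fast local disk.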
I haven't done any testing for the multi-client case, as that's slightly
more difficult to properly test while holding everything else constant. By
multi-client, I mean multiple actual cache managers involved as well as
multiple users behind the same cache manager.