[OpenAFS] Server disk operations speed

jukka.tuominen@finndesign.fi
Mon, 8 Apr 2013 21:09:43 +0300 (EEST)


Hi Jeffrey,

these numbers are very helpful, as well as the list of performance
factors, thank you!

To me, these numbers are significantly faster than mine: 2-20x. I
appreciate that the size and number of files make a big difference, but I
still think there must be something wrong with my configuration, since I'm
not getting anywhere close to these numbers.

The flushed/not-flushed numbers got me thinking. Not understanding the
depths of AFS: isn't the AFS cache meant to always be in sync with the
server, or is it possible (the default?) for applications to let go of the
data before it is confirmed to exist on the server? That is, if my
configuration somehow forces synchronous writes against the default, that
could explain the poor performance. Sorry for asking stupid questions...
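One way to probe this without knowing the cache internals is to time the
same write with and without an explicit fsync. A rough sketch (the 8 MB
size and /tmp path are arbitrary; point it at a file inside /afs to test
the client):

```python
import os
import time

def time_write(path, size_mb=8, do_fsync=False):
    """Write size_mb of zeros to path; optionally force it to stable storage."""
    chunk = b"\0" * (1024 * 1024)
    t0 = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)
        f.flush()
        if do_fsync:
            # fsync forces write-through; on an AFS path this should not
            # return until the data has reached the file server
            os.fsync(f.fileno())
    return time.monotonic() - t0

if __name__ == "__main__":
    print("buffered:", time_write("/tmp/probe.bin"))
    print("fsync'd :", time_write("/tmp/probe.bin", do_fsync=True))
```

If the two times are close, every write is already going out synchronously;
if the fsync'd run is much slower, the client is buffering normally and
something else must explain the slow copies.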

br, jukka



> If you want to have a consistent benchmark with other installations
> you need to provide a scripted set of operations using a common dataset
> to measure.  Selecting a random source of files to copy I chose my
> openafs .git directory tree.
>
> Using robocopy on a Windows client to a MacMini running OSX 10.6.8
> Server and OpenAFS 1.6.2 using iSCSI RAID6 EXT4 storage for the vice
> partition I obtain transfer rates of:
>
> Copy to AFS:
>
>                Total    Copied   Skipped  Mismatch    FAILED    Extras
>     Dirs :       274       273         1         0         0         0
>    Files :      2720      2720         0         0         0         0
>    Bytes :   77.53 m   77.53 m         0         0         0         0
>    Times :   0:00:51   0:00:38                       0:00:00   0:00:13
>
>
>    Speed :             2127901 Bytes/sec.
>    Speed :             121.759 MegaBytes/min.
>
> Copy from AFS (cached):
>
>               Total    Copied   Skipped  Mismatch    FAILED    Extras
>     Dirs :       274       273         1         0         0         0
>    Files :      2720      2720         0         0         0         0
>    Bytes :   77.53 m   77.53 m         0         0         0         0
>    Times :   0:00:25   0:00:19                       0:00:00   0:00:05
>
>
>    Speed :             4092669 Bytes/sec.
>    Speed :             234.184 MegaBytes/min.
>
> Copy from AFS (not cached):
>
>                Total    Copied   Skipped  Mismatch    FAILED    Extras
>     Dirs :       274       273         1         0         0         0
>    Files :      2720      2720         0         0         0         0
>    Bytes :   77.53 m   77.53 m         0         0         0         0
>    Times :   0:00:34   0:00:28                       0:00:00   0:00:05
>
>
>    Speed :             2875507 Bytes/sec.
>    Speed :             164.537 MegaBytes/min.
>
>
> The data set contains one 64MB file, one 5MB file and the other ~8MB
> spread across 2718 files and 273 directories.
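> A comparable tree (many small files plus a couple of large ones) can be
> generated with a short script, so different sites measure against the
> same shape of data. This is only a sketch of the mix described above,
> not the actual .git contents:

```python
import os

def make_dataset(root, n_dirs=273, n_files=2718, small_bytes=3000,
                 big_sizes=(64 * 1024**2, 5 * 1024**2)):
    """Create n_dirs directories holding n_files small files, plus a few
    large files, roughly matching the mix described above."""
    for d in range(n_dirs):
        os.makedirs(os.path.join(root, "dir%03d" % d), exist_ok=True)
    for i in range(n_files):
        sub = os.path.join(root, "dir%03d" % (i % n_dirs))
        with open(os.path.join(sub, "f%04d" % i), "wb") as f:
            f.write(os.urandom(small_bytes))
    for j, size in enumerate(big_sizes):
        # write the large files in 1 MB chunks to bound memory use
        with open(os.path.join(root, "big%d.bin" % j), "wb") as f:
            for _ in range(size // (1024 * 1024)):
                f.write(os.urandom(1024 * 1024))
```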
>
> The time to copy a single 200MB file is as follows:
>
> Write to AFS:
> timer on
> Timer 1 on: 13:17:04
> copy d:\random.200mb
> D:\random.200mb => \\afs\yfs\project\test\test1\random.200mb
>      1 file copied
> echo 18.8501413761 MB/secs
> 18.8501413761 MB/secs
> timer off
> Timer 1 off: 13:17:14  Elapsed: 0:00:10.61
>
> Read from AFS (no flush):
> timer on
> Timer 1 on: 13:17:14
> copy random.200mb d:\random.200mb
> \\afs\yfs\project\test\test1\random.200mb => D:\random.200mb
>      1 file copied
> echo 416.6666666667 MB/secs
> 416.6666666667 MB/secs
> timer off
> Timer 1 off: 13:17:15  Elapsed: 0:00:00.48
>
> Read from AFS (after flush):
> timer on
> Timer 1 on: 13:17:15
> copy random.200mb d:\random.200mb
> \\afs\yfs\project\test\test1\random.200mb => D:\random.200mb
>      1 file copied
> echo 24.1837968561 MB/secs
> 24.1837968561 MB/secs
> timer off
> Timer 1 off: 13:17:23  Elapsed: 0:00:08.27
>
> As others have mentioned:
>
>  1. when there are large numbers of small files, the cost of
>     the RPC overhead, including network latency, far surpasses the
>     cost of sending the file data.
>
>  2. data encryption (all of the above numbers are with encryption)
>     comes at a performance cost.
>
>  3. virtualization of I/O (network and disk) is expensive.
>     if the VM's I/O is competing with the hypervisor, other VMs,
>     and the host system for disk and network cycles, there will
>     be large increases in the latency of each request.
>
>  4. the vice partition file system makes a difference
>
>  5. file system journaling makes a difference
>
>  6. RAID configuration makes a difference
>
>  7. there are many other potential variables
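> Point 1 can be made concrete with back-of-envelope arithmetic on the
> copy-to-AFS run above: subtract the time the raw bytes would take at an
> assumed wire rate (the 20 MB/s here is a guess, purely for illustration)
> and spread the remainder over the file operations:

```python
def per_file_overhead_ms(total_s, total_mb, n_files, wire_mb_s=20.0):
    """Subtract the time the raw data would take at wire_mb_s, then
    spread the remaining elapsed time over the individual files."""
    data_s = total_mb / wire_mb_s
    return (total_s - data_s) * 1000.0 / n_files

# 2720 files, 77.53 MB in 51 s (the robocopy run above): most of the
# elapsed time is per-file overhead, not data transfer
print(round(per_file_overhead_ms(51, 77.53, 2720), 1))  # ~17.3 ms per file
```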
>
> I hope these numbers are helpful.
>
> Jeffrey Altman
>
>
> On 4/8/2013 12:11 PM, jukka.tuominen@finndesign.fi wrote:
>>
>> Hi all,
>>
>> thank you for your responses. Before going through them in detail, I
>> would just like to do a reality check. What kind of performance figures
>> should one expect from an averagely working AFS network (LAN/WAN)? Say,
>> if you duplicated roughly the same 2000 files/300MB directory in your
>> setup, what kind of rates would you get? That is, is 500-1000 KB/s a
>> reasonable starting point for optimization, or should it be a magnitude
>> or two higher?
>>
>> I actually created a simple genetic-algorithm program to optimize the
>> fileserver parameters. Even though I didn't run it to full length, I'm
>> pretty confident I couldn't achieve much higher rates just by
>> parameterizing the fileserver.
>>
>> I'm trying to get good performance for generic, mixed use; one user may
>> work with large individual files, another with a large number of small
>> files. It shouldn't be optimized for either extreme.
>>
>> br, jukka
>>
>>
>>
>> _______________________________________________
>> OpenAFS-info mailing list
>> OpenAFS-info@openafs.org
>> https://lists.openafs.org/mailman/listinfo/openafs-info
>>