[OpenAFS] Advice regarding OpenAFS performance?
Andreas Breitfeld
andreas@MPA-Garching.MPG.DE
Wed, 16 Aug 2023 09:51:36 +0200
Hi Collin,
if network traffic encryption is enabled on your AFS clients (check
with "fs getcrypt"), you can get a large performance improvement by
switching it off right after the client daemon starts, for example
in an init script with "/usr/afsws/bin/fs setcrypt off" (use the full
path to the "fs" command in your environment).
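
A minimal sketch of such an init-script step, in case it helps. The
function name disable_afs_crypt and the FS_BIN variable are made up for
illustration, the default path is only the one mentioned above, and the
match on the word "crypt" assumes "fs getcrypt" reports the security
level as text (e.g. "crypt" versus "clear"):

```shell
#!/bin/sh
# Sketch: switch off AFS client encryption if it is currently enabled.
# FS_BIN is an assumption; point it at the "fs" binary on your system.
FS_BIN="${FS_BIN:-/usr/afsws/bin/fs}"

# disable_afs_crypt is a hypothetical helper, not part of OpenAFS.
disable_afs_crypt() {
    fs_bin="$1"
    if [ ! -x "$fs_bin" ]; then
        echo "fs not found at $fs_bin, skipping" >&2
        return 1
    fi
    # Assumes the getcrypt output contains "crypt" only when encryption is on.
    case "$("$fs_bin" getcrypt 2>/dev/null)" in
        *crypt*) "$fs_bin" setcrypt off && echo "encryption disabled" ;;
        *)       echo "encryption already off or state unknown" ;;
    esac
}

# Do not fail the surrounding init script if fs is missing.
disable_afs_crypt "$FS_BIN" || true
```

Run it after the client daemon is up; the setting applies per client, so
each machine needs it.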
Thanks,
Andreas
On 15.08.23 00:29, Collin Gros wrote:
> Dear OpenAFS community,
>
> We are administrators for an OpenAFS environment of (what will be) about
> 400 users and are running into some performance issues, for which we
> hope you might have some advice...
>
> 1. Do you have any sources we can look at that might help us in
> adjusting configuration to improve performance? We read the man page for
> `dafileserver` and messed around a lot with our arguments to
> `dafileserver` (increasing them past the values set for -L, or Large)...
> though we haven't noticed much of an improvement in performance through
> our testing. See below for the configuration we currently have set for
> `dafileserver` on all of our OpenAFS file servers.
>
> 2. Do you know what kind of read/write speed we should expect for an
> environment/configuration of this size? It would be helpful for us to
> know what we should be expecting in our environment as far as
> performance is concerned.
>
> ===========================
>
> Our performance test
>
> ===========================
>
> Here are results from our testing with a binary file (7103053824 bytes
> in size, or 6.7GB), copying it from one client to AFS:
>
> client1: openSUSE 15.1
>
> server: AFS file server that hosts the AFS volumes used for our testing
>
> `scp`: client1 (local) -> server (local): 102.2MB/s (66s)
>
> `cp`: client1 (local) -> client1 (AFS file space): 19.2MB/s (352s)
>
> `cp`: client1 (AFS file space) -> client1 (AFS file space): 19.46MB/s
> (348s)
>
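
As a cross-check on the single-client figures, a small sketch that
derives throughput from the file size and the elapsed time. The helper
name throughput is made up for illustration; it prints both decimal
MB/s and binary MiB/s, since the two differ by about 5%:

```shell
#!/bin/sh
# Sketch: compute throughput from a size in bytes and a wall-clock time.
# throughput is a hypothetical helper, not part of OpenAFS.
throughput() {
    # $1 = bytes transferred, $2 = elapsed seconds
    awk -v b="$1" -v s="$2" 'BEGIN {
        printf "%.2f MB/s  %.2f MiB/s\n", b / s / 1000000, b / s / 1048576
    }'
}

throughput 7103053824 66    # scp, client1 local -> server local
throughput 7103053824 352   # cp, client1 local -> AFS
throughput 7103053824 348   # cp, AFS -> AFS
```

The file is exactly 6774 MiB, so 352 s works out to about 19.2 MiB/s,
which suggests the quoted "MB/s" values are binary megabytes per second.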
> Here are results from our testing with the same binary file (7103053824
> bytes in size, or 6.7GB), copying it in parallel from two clients to the
> same AFS volume:
>
> client1 (local) -> server (AFS file space): 10.22MB/s (663s)
>
> client2 (local) -> server (AFS file space): 9.69MB/s (699s)
>
> client1 (AFS file space) -> client1 (AFS file space): 5.38MB/s (1258s)
>
> client2 (AFS file space) -> client2 (AFS file space): 7MB/s (965s)
>
> client1 (AFS file space) -> client1 (local): 13.15MB/s (515s)
>
> client2 (AFS file space) -> client2 (local): 15.57MB/s (435s)
>
> client1 total time taken: 2436s
>
> client2 total time taken: 2099s
>
> Here is a snapshot of what `top` looks like from the AFS file server
> while the copy is taking place:
>
> top - 16:14:14 up 5 days, 7:29, 2 users, load average: 1.06, 0.37, 0.26
> Tasks: 297 total, 2 running, 294 sleeping, 1 stopped, 0 zombie
> %Cpu0 : 17.3 us,  6.5 sy, 0.0 ni, 69.4 id,  1.7 wa, 1.0 hi,  4.1 si, 0.0 st
> %Cpu1 : 16.2 us,  4.1 sy, 0.0 ni, 65.5 id, 13.2 wa, 0.7 hi,  0.3 si, 0.0 st
> %Cpu2 :  5.0 us,  6.7 sy, 0.3 ni, 12.4 id, 63.2 wa, 1.0 hi, 11.4 si, 0.0 st
> %Cpu3 :  7.5 us,  5.1 sy, 9.2 ni, 44.2 id, 31.5 wa, 1.4 hi,  1.0 si, 0.0 st
> %Cpu4 : 13.3 us,  6.5 sy, 2.0 ni, 67.6 id,  9.9 wa, 0.7 hi,  0.0 si, 0.0 st
> %Cpu5 : 37.4 us, 14.6 sy, 0.0 ni, 41.1 id,  6.0 wa, 0.7 hi,  0.3 si, 0.0 st
> MiB Mem : 24080.5 total, 14283.7 free, 526.5 used, 9270.3 buff/cache
> MiB Swap: 4060.0 total, 4060.0 free, 0.0 used. 23105.9 avail Mem
>
>   PID USER PR NI    VIRT   RES  SHR S  %CPU %MEM    TIME+ COMMAND
> 22409 root 15 -5 4282356 65240 2808 S 118.3  0.3 75:55.61 dafileserver
>
> Here is the output of `fs getcacheparms` while both clients were copying
> the file to AFS:
>
> client1: AFS using 781060 of the cache's available 891289 1K byte blocks.
>
> client2: AFS using 0 of the cache's available 891289 1K byte blocks.
>
> ***************************
>
> Our environment
>
> ***************************
>
> We have our environment configuration documented below, and are hoping
> you might give us some pointers as to what might be a performance
> bottleneck.
>
> Our testing environment:
>
> - OpenAFS Servers
>
> - OpenAFS 1.8.9
>
> - DB servers (total of 3)
>
> - 1 master
>
> - Rocky Linux 8.8
>
> - 2 CPU
>
> - 4GB RAM
>
> - 2 replicas, with each having:
>
> - Rocky Linux 8.8
>
> - 2 CPU
>
> - 4GB RAM
>
> - FS servers (total of 3)
>
> - 3 fileservers, with each having:
>
> - Rocky Linux 8.8
>
> - 6 CPU
>
> - 24GB RAM
>
> - /usr/afs/local/BosConfig:
>
> restrictmode 0
> restarttime 16 0 0 0 0
> checkbintime 3 0 5 0 0
> bnode dafs dafs 1
> parm /usr/afs/bin/dafileserver -L -cb 640000 -abortthreshold 0 -vc 1000
> parm /usr/afs/bin/davolserver -p 64 -log
> parm /usr/afs/bin/salvageserver
> parm /usr/afs/bin/dasalvager -parallel all32
> end
> bnode simple upclientetc 1
> parm /usr/afs/bin/upclient db1 /usr/afs/etc
> end
> bnode simple upclientbin 1
> parm /usr/afs/bin/upclient db1 /usr/afs/bin
> end
>
> - OpenAFS Clients
>
> - client1
>
> - openSUSE 15.1
>
> - OpenAFS 1.8.7
>
> - 6 CPUs
>
> - 16GB RAM
>
> - `fs getcacheparms`
>
> AFS using 12 of the cache's available 891289 1K byte blocks.
>
> - /etc/sysconfig/openafs-client:
>
> AFSD_ARGS="-fakestat -stat 6000 -dcache 6000 -daemons 6 -volumes 256 -files 50000 -chunksize 17"
>
> - client2
>
> - openSUSE 13.2
>
> - OpenAFS 1.8.7
>
> - 2 CPUs
>
> - 2GB RAM
>
> - `fs getcacheparms`
>
> AFS using 0 of the cache's available 891289 1K byte blocks.
>
> - /etc/sysconfig/afs
>
> OPTIONS=$XXLARGE
>
> (and XXLARGE="-fakestat -stat 4000 -dcache 4000 -daemons 6 -volumes 256 -afsdb")
>
> Thanks for the help!!
>
> Regards,
>
> Collin
>
> *Collin Gros*
>
> *Staff Software Engineer*
>
> *RICOH Graphic Communications - DSBC*
>
> *Ricoh USA, Inc*
> Phone: +1 720-663-3225
>
> Email: collin.gros@ricoh-usa.com
>