[OpenAFS] Advice regarding OpenAFS performance?

Andreas Breitfeld andreas@MPA-Garching.MPG.DE
Wed, 16 Aug 2023 09:51:36 +0200


Hi Collin,

If network traffic encryption is enabled on your AFS client (check 
with "fs getcrypt"), you can get a huge performance improvement by 
switching it off immediately after the client daemon starts, for example 
in an init script with "/usr/afsws/bin/fs setcrypt off" (use the full path 
to the "fs" command in your environment).
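
A minimal Python sketch of that check-then-disable step, in case you prefer it to a bare init-script line. The helper names are made up for illustration, the path is the one from my example, and I am assuming that "fs getcrypt" mentions "crypt" when encryption is on and "clear" when it is off; please compare with the actual output on your client before relying on it:

```python
import subprocess

# Assumption: full path to "fs"; adjust for your environment.
FS = "/usr/afsws/bin/fs"


def crypt_enabled(getcrypt_output: str) -> bool:
    """Guess the encryption state from the text printed by `fs getcrypt`.

    Assumption: the output contains "crypt" when encryption is on and
    "clear" when it is off. Verify against your client's real output.
    """
    text = getcrypt_output.lower()
    return "crypt" in text and "clear" not in text


def disable_crypt_if_enabled(fs_path: str = FS) -> bool:
    """Run `fs getcrypt` and, if encryption looks enabled, `fs setcrypt off`.

    Returns True when setcrypt was invoked. Needs a running AFS client
    and enough privileges to change the setting.
    """
    out = subprocess.run(
        [fs_path, "getcrypt"], capture_output=True, text=True, check=True
    ).stdout
    if not crypt_enabled(out):
        return False
    subprocess.run([fs_path, "setcrypt", "off"], check=True)
    return True
```

Calling disable_crypt_if_enabled() once after the client daemon is up should be enough; the setting does not survive a client restart, which is why it belongs in the startup path.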

Thanks,
Andreas


On 15.08.23 00:29, Collin Gros wrote:
> Dear OpenAFS community,
> 
> We are administrators for an OpenAFS environment of (what will be) about 
> 400 users and are running into some performance issues, for which we 
> hope you might have some advice...
> 
> 1. Do you have any sources we can look at that might help us in 
> adjusting configuration to improve performance? We read the man page for 
> `dafileserver` and messed around a lot with our arguments to 
> `dafileserver` (increasing them past the values set for -L, or Large)... 
> though we haven't noticed much of an improvement in performance through 
> our testing. See below for the configuration we currently have set for 
> `dafileserver` on all of our OpenAFS file servers.
> 
> 2. Do you know what kind of read/write speed we should expect for an 
> environment/configuration of this size? It would be helpful for us to 
> know what we should be expecting in our environment as far as 
> performance is concerned.
> 
> ===========================
> 
> Our performance test
> 
> ===========================
> 
> Here are results from our testing with a binary file (7103053824 bytes 
> in size, or about 6.6 GiB), copying it from one client to AFS:
> 
>    client1: openSUSE 15.1
> 
>    server: AFS file server that hosts the AFS volumes used for our testing
> 
>    `scp`: client1 (local) -> server (local): 102.2MB/s (66s)
> 
>    `cp`: client1 (local) -> client1 (AFS file space): 19.2MB/s (352s)
> 
>    `cp`: client1 (AFS file space) -> client1 (AFS file space): 19.46MB/s (348s)
> 
> Here are results from our testing with the same binary file (7103053824 
> bytes in size, or about 6.6 GiB), copying it in parallel from two clients to the 
> same AFS volume:
> 
>    client1 (local) -> server (AFS file space): 10.22MB/s (663s)
> 
>    client2 (local) -> server (AFS file space): 9.69MB/s (699s)
> 
>    client1 (AFS file space) -> client1 (AFS file space): 5.38MB/s (1258s)
> 
>    client2 (AFS file space) -> client2 (AFS file space): 7MB/s (965s)
> 
>    client1 (AFS file space) -> client1 (local): 13.15MB/s (515s)
> 
>    client2 (AFS file space) -> client2 (local): 15.57MB/s (435s)
> 
>    client1 total time taken: 2436s
> 
>    client2 total time taken: 2099s
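
As a sanity check on the numbers above: the quoted rates match file size divided by elapsed time if "MB" is read as MiB (2^20 bytes). That reading is my inference, not something stated in the report. A small Python sketch of the arithmetic, using only the size and timings quoted above:

```python
SIZE_BYTES = 7103053824  # test file size quoted above
MIB = 2 ** 20            # assumption: the reported "MB" means MiB


def mib_per_s(seconds: float, size_bytes: int = SIZE_BYTES) -> float:
    """Average throughput in MiB/s for one copy of the test file."""
    return size_bytes / seconds / MIB


# Elapsed times in seconds, copied from the results above.
single_client = {"local -> AFS": 352, "AFS -> AFS": 348}
client1_parallel = [663, 1258, 515]
client2_parallel = [699, 965, 435]

for label, secs in single_client.items():
    print(f"single client, {label}: {mib_per_s(secs):.2f} MiB/s")

for name, times in (("client1", client1_parallel), ("client2", client2_parallel)):
    rates = ", ".join(f"{mib_per_s(t):.2f}" for t in times)
    print(f"{name} parallel: {rates} MiB/s, total {sum(times)}s")
```

The per-step times add up to the quoted totals (2436s and 2099s), so the totals are simply the three copies run back to back on each client.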
> 
> Here is a snapshot of what `top` looks like from the AFS file server 
> while the copy is taking place:
> 
>    top - 16:14:14 up 5 days,  7:29,  2 users,  load average: 1.06, 0.37, 0.26
>    Tasks: 297 total,   2 running, 294 sleeping,   1 stopped,   0 zombie
>    %Cpu0  : 17.3 us,  6.5 sy,  0.0 ni, 69.4 id,  1.7 wa,  1.0 hi,  4.1 si,  0.0 st
>    %Cpu1  : 16.2 us,  4.1 sy,  0.0 ni, 65.5 id, 13.2 wa,  0.7 hi,  0.3 si,  0.0 st
>    %Cpu2  :  5.0 us,  6.7 sy,  0.3 ni, 12.4 id, 63.2 wa,  1.0 hi, 11.4 si,  0.0 st
>    %Cpu3  :  7.5 us,  5.1 sy,  9.2 ni, 44.2 id, 31.5 wa,  1.4 hi,  1.0 si,  0.0 st
>    %Cpu4  : 13.3 us,  6.5 sy,  2.0 ni, 67.6 id,  9.9 wa,  0.7 hi,  0.0 si,  0.0 st
>    %Cpu5  : 37.4 us, 14.6 sy,  0.0 ni, 41.1 id,  6.0 wa,  0.7 hi,  0.3 si,  0.0 st
>    MiB Mem :  24080.5 total,  14283.7 free,    526.5 used,   9270.3 buff/cache
>    MiB Swap:   4060.0 total,   4060.0 free,      0.0 used.  23105.9 avail Mem
> 
>        PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>      22409 root      15  -5 4282356  65240   2808 S 118.3   0.3  75:55.61 dafileserver
> 
> Here is the output of `fs getcacheparms` while both clients were copying 
> the file to AFS:
> 
>    client1: AFS using 781060 of the cache's available 891289 1K byte blocks.
> 
>    client2: AFS using 0 of the cache's available 891289 1K byte blocks.
> 
> ***************************
> 
> Our environment
> 
> ***************************
> 
> We have our environment configuration documented below, and are hoping 
> you might give us some pointers as to what might be a performance 
> bottleneck.
> 
>    Our testing environment:
> 
>      - OpenAFS Servers
> 
>        - OpenAFS 1.8.9
> 
>        - DB servers (total of 3)
> 
>          - 1 master
> 
>            - Rocky Linux 8.8
> 
>            - 2 CPU
> 
>            - 4GB RAM
> 
>          - 2 replicas, with each having:
> 
>            - Rocky Linux 8.8
> 
>            - 2 CPU
> 
>            - 4GB RAM
> 
>        - FS servers (total of 3)
> 
>          - 3 fileservers, with each having:
> 
>            - Rocky Linux 8.8
> 
>            - 6 CPU
> 
>            - 24GB RAM
> 
>            - /usr/afs/local/BosConfig:
> 
>                restrictmode 0
> 
>                restarttime 16 0 0 0 0
> 
>                checkbintime 3 0 5 0 0
> 
>                bnode dafs dafs 1
> 
>                parm /usr/afs/bin/dafileserver -L -cb 640000 -abortthreshold 0 -vc 1000
> 
>                parm /usr/afs/bin/davolserver -p 64 -log
> 
>                parm /usr/afs/bin/salvageserver
> 
>                parm /usr/afs/bin/dasalvager -parallel all32
> 
>                end
> 
>                bnode simple upclientetc 1
> 
>                parm /usr/afs/bin/upclient db1 /usr/afs/etc
> 
>                end
> 
>                bnode simple upclientbin 1
> 
>                parm /usr/afs/bin/upclient db1 /usr/afs/bin
> 
>                end
> 
>      - OpenAFS Clients
> 
>        - client1
> 
>          - openSUSE 15.1
> 
>          - OpenAFS 1.8.7
> 
>          - 6 CPUs
> 
>          - 16GB RAM
> 
>          - `fs getcacheparms`
> 
>              AFS using 12 of the cache's available 891289 1K byte blocks.
> 
>          - /etc/sysconfig/openafs-client:
> 
>              AFSD_ARGS="-fakestat -stat 6000 -dcache 6000 -daemons 6 -volumes 256 -files 50000 -chunksize 17"
> 
>        - client2
> 
>          - openSUSE 13.2
> 
>          - OpenAFS 1.8.7
> 
>          - 2 CPUs
> 
>          - 2GB RAM
> 
>          - `fs getcacheparms`
> 
>              AFS using 0 of the cache's available 891289 1K byte blocks.
> 
>          - /etc/sysconfig/afs
> 
>              OPTIONS=$XXLARGE
> 
>                (and XXLARGE="-fakestat -stat 4000 -dcache 4000 -daemons 6 -volumes 256 -afsdb")
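
To put the client cache settings above in relation to the test file: afsd's -chunksize takes a base-2 exponent, so client1's "-chunksize 17" means 128 KiB chunks, and both clients report a cache of 891289 1K blocks. The sketch below is plain arithmetic on those quoted values; the variable names are mine, and it is not meant as a diagnosis:

```python
CACHE_KIB = 891289       # "fs getcacheparms": available 1K byte blocks
CHUNK_EXP = 17           # client1: -chunksize 17 (exponent of 2)
FILE_BYTES = 7103053824  # size of the test file

chunk_bytes = 2 ** CHUNK_EXP                       # bytes per cache chunk
cache_bytes = CACHE_KIB * 1024                     # total cache size in bytes
cache_mib = cache_bytes / 2 ** 20                  # cache size in MiB
chunks_in_cache = cache_bytes // chunk_bytes       # whole chunks that fit
file_to_cache_ratio = FILE_BYTES / cache_bytes     # file size relative to cache

print(f"chunk size:       {chunk_bytes // 1024} KiB")
print(f"cache size:       {cache_mib:.1f} MiB")
print(f"chunks in cache:  {chunks_in_cache}")
print(f"file/cache ratio: {file_to_cache_ratio:.1f}x")
```

So the test file is several times larger than the whole client cache, which is worth keeping in mind when comparing these runs with tests on smaller files.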
> 
> Thanks for the help!!
> 
> Regards,
> 
> Collin
> 
> *Collin Gros*
> 
> *Staff Software Engineer*
> 
> *RICOH Graphic Communications - DSBC*
> 
> *Ricoh USA, Inc*
> Phone: +1 720-663-3225
> 
> Email: collin.gros@ricoh-usa.com
>