[OpenAFS] Performance issues with Git repositories (or in general with many small files workloads)

Ciprian Dorin Craciun ciprian.craciun@gmail.com
Thu, 17 Dec 2020 00:21:42 +0200

Hello all!

I'm trying to use AFS to back up various Git repositories.  By "backup"
I actually mean `git push --mirror /afs/.cell/some-path/repository.git`,
which has the following behaviour:  it writes many small files in the
`.git/objects` folder fanned by the first two hex digits of the object

In fact this pattern can be found in many applications that handle
lots of small files.  For example `rsync`, build systems, etc.
Moreover the pattern I'm describing is single-threaded, as in these
files are not created concurrently by multiple threads / processes.
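To make the workload easier to reproduce without Git, here is a minimal sketch of the pattern described above: many small files written one at a time into subdirectories named after the first two hex digits of a hash.  It only approximates the pattern and is not what Git does internally;  the function name, the `BENCH_DIR` environment variable, and the default count and size are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile
import time


def write_fanned_objects(root, count=1000, size=512):
    """Write `count` small files under root/<xx>/<rest>, strictly one at a time.

    Returns (files_written, elapsed_seconds).
    """
    start = time.perf_counter()
    for i in range(count):
        # Deterministic payload of `size` bytes, different for each file.
        payload = (b"%d\n" % i).ljust(size, b"x")
        name = hashlib.sha1(payload).hexdigest()
        # Fan out on the first two hex digits, as in `.git/objects`.
        subdir = os.path.join(root, name[:2])
        os.makedirs(subdir, exist_ok=True)
        with open(os.path.join(subdir, name[2:]), "wb") as fh:
            fh.write(payload)
    return count, time.perf_counter() - start


if __name__ == "__main__":
    # BENCH_DIR is a hypothetical variable: point it at a directory on AFS
    # or on a local disk; a fresh temporary directory is used otherwise.
    target = os.environ.get("BENCH_DIR") or tempfile.mkdtemp()
    n, secs = write_fanned_objects(target)
    print(f"{n} files in {secs:.2f}s ({n / max(secs, 1e-9):.0f} files/s) in {target}")
```

Running it once against a local directory and once against a path under `/afs` should give a files-per-second figure for each that can be compared directly.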

Unfortunately the performance is abysmal:  what should take perhaps
1-2 seconds on a normal drive takes up to a minute on AFS;  for
example `git-push` reports a bandwidth of only ~20

Looking at the CPU usage, the `dafileserver` seems to be at ~95%,
although the system has 4 cores and is lightly used.

I can eliminate the following causes:
* network issues (either bandwidth or latency), because this behaviour
occurs even if I mount AFS on the same server where the file server
lives, thus everything happens over loopback;
* encryption -- it is off;
* synchronous close -- I've tried to set `fs storebehind -allfiles
16384 -verbose`;
* disks backing AFS cache -- it's an NVMe disk capable of ~3GiB/s;
* disks backing AFS file server -- it's a RAID5 of 3 top-of-the-line
(Gold) WD S-ATA drives;
* I can achieve good throughput for large files, or if accessing
medium sized files from multiple threads / processes;
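One way to check where the per-file time goes (in particular whether `close` still dominates after changing `fs storebehind`) is to time the create, write, and close steps separately.  The following is only a rough sketch under that assumption;  the function name, the `probe-` file prefix, and the defaults are hypothetical, and the numbers it gives are wall-clock totals, not a proper profile.

```python
import os
import tempfile
import time


def time_phases(directory, count=200, size=512):
    """Create `count` small files sequentially and sum the time per phase.

    Returns a dict with total seconds spent in "create", "write" and "close".
    """
    totals = {"create": 0.0, "write": 0.0, "close": 0.0}
    payload = b"x" * size
    for i in range(count):
        path = os.path.join(directory, f"probe-{i:06d}")
        t0 = time.perf_counter()
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        t1 = time.perf_counter()
        os.write(fd, payload)
        t2 = time.perf_counter()
        os.close(fd)
        t3 = time.perf_counter()
        totals["create"] += t1 - t0
        totals["write"] += t2 - t1
        totals["close"] += t3 - t2
    return totals


if __name__ == "__main__":
    # A temporary local directory is used here; pass a directory under /afs
    # to time_phases() instead to measure the AFS client.
    for phase, secs in time_phases(tempfile.mkdtemp()).items():
        print(f"{phase:>6}: {secs * 1000:.1f} ms total")
```

If most of the time lands in one phase on AFS but not on a local disk, that narrows down which operation is slow.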

My OpenAFS deployment is on Linux 5.3.18, OpenSUSE Leap 15.2, and the
following are the arguments of the file server and cache manager:

/usr/lib/openafs/dafileserver -syslog -sync onclose \
-p 128 -b 524288 -l 524288 -s 1048576 -vc 4096 \
-cb 1048576 -vhandle-max-cachesize 32768 \
-udpsize 67108864 -sendsize 67108864 \
-rxpck 4096 -rxmaxmtu 1400 -busyat 65536

/usr/sbin/afsd -blocks 67108864 -chunksize 17 -files 524288 \
-files_per_subdir 4096 -dcache 524288 \
-stat 524288 -volumes 4096 \
-splitcache 90/10 \
-afsdb -dynroot-sparse -fakestat-all \
-inumcalc md5 -backuptree \
-daemons 8 -rxmaxfrags 8 -rxmaxmtu 1400 \
-rxpck 4096 -nosettime

BTW, initially I was using the old `fileserver`-based setup, and
even though I've switched to `dafileserver` the performance seems to
stay unchanged.

Thanks for the help,