[OpenAFS] Advice on using BTRFS for vicep partitions on Linux

Ciprian Craciun ciprian.craciun@gmail.com
Wed, 22 Mar 2023 09:02:17 +0200


On Tue, Mar 21, 2023 at 9:32 PM <spacefrogg-openafs@spacefrogg.net> wrote:
> The main ingredient on BTRFS is to disable Copy-on-Write for the
> respective. This also somewhat mitigates surprising out-of-space issues.


What is the reason behind disabling copy-on-write for BTRFS?  Does it
break OpenAFS in some way, or is it only the out-of-space issue?



> You need to provide the 'nodatacow' mount option.
> You lose data checksumming and compression on BTRFS. So, reasonable RAID
> config and scrubbing may be more important, now.


Unfortunately (at least for my use-case) losing the checksumming and
compression is a no-go, because these were exactly the features that
made BTRFS appealing versus Ext4.
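
For anyone wanting to check what a given vicep partition is actually
mounted with, here is a minimal Python sketch.  It parses text in the
`/proc/mounts` format and reports whether data checksumming is still in
effect.  The sample mount table, the device names, and the helper names
`mount_options` and `keeps_checksums` are made up for illustration;  it
also only looks at mount options, so files individually marked with
`chattr +C` would not be noticed.

```python
def mount_options(mounts_text, mount_point):
    """Return the set of options for mount_point, or None if it is not listed.

    mounts_text is expected in /proc/mounts format:
    device, mount point, fs type, comma-separated options, two numbers.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))  # the last matching entry wins
    return found


def keeps_checksums(mounts_text, mount_point):
    """True if neither nodatacow nor nodatasum is among the mount options."""
    options = mount_options(mounts_text, mount_point)
    if options is None:
        raise LookupError(f"{mount_point} is not mounted")
    return not ({"nodatacow", "nodatasum"} & options)


# Hypothetical sample; on a real system read the text from /proc/mounts.
SAMPLE = (
    "/dev/md0 /vicepa btrfs rw,noatime,compress=zstd:3 0 0\n"
    "/dev/md1 /vicepb btrfs rw,noatime,nodatasum,nodatacow 0 0\n"
)

if __name__ == "__main__":
    for mount_point in ("/vicepa", "/vicepb"):
        print(mount_point, keeps_checksums(SAMPLE, mount_point))
```

On a live server the same functions could be fed
`open("/proc/mounts").read()` instead of `SAMPLE`.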

Also, regarding RAID scrubbing, it doesn't cover the issue of
checksumming, because (for example with RAID5) it can only detect that
one of the disks has corrupted data, but cannot say which.
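
To illustrate the point with a toy example (this is not how md or BTRFS
implement it;  the block contents and variable names are invented):  with
XOR parity alone a scrub sees that a stripe is inconsistent, but every
block is an equally plausible culprit, whereas per-block checksums single
out the bad block so it can be rebuilt from the others.

```python
import zlib
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equally sized blocks byte by byte (RAID5-style parity)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One stripe: three toy data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
# Per-block checksums, standing in for what a checksumming file system keeps.
checksums = [zlib.crc32(block) for block in data]

# Silent corruption: one byte of block 1 changes, no I/O error is reported.
damaged = [data[0], b"BXBB", data[2]]

# A scrub that only has parity notices that the stripe is inconsistent ...
stripe_consistent = xor_blocks(damaged) == parity

# ... but rebuilding any single block from the others makes the stripe
# consistent again, so parity alone cannot tell which block is the bad one.
plausible_culprits = []
for i in range(len(damaged)):
    others = damaged[:i] + damaged[i + 1:]
    candidate = damaged[:i] + [xor_blocks(others + [parity])] + damaged[i + 1:]
    if xor_blocks(candidate) == parity:
        plausible_culprits.append(i)

# With checksums the bad block is identified and rebuilt from the rest.
bad = [i for i, block in enumerate(damaged) if zlib.crc32(block) != checksums[i]]
repaired = xor_blocks(damaged[:bad[0]] + damaged[bad[0] + 1:] + [parity])

print(stripe_consistent, plausible_culprits, bad, repaired)
# prints: False [0, 1, 2] [1] b'BBBB'
```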

(As an alternative to file-system provided checksumming, at least on
Linux, there is `dm-integrity`, configured via `integritysetup`,
which could provide checksumming at the block level;  but at the moment
I'm still experimenting with it for other use-cases.)



> Additionally, depending on your exact setup, you may want to disable
> write barriers (e.g. for network attached storage, 'nobarrier') when it
> is without effect.


Could you elaborate more on this?  I guess it doesn't apply to
directly attached disks.  Is this in order to increase write
performance, or for some other reason?

Have you also changed the `-sync` file-server option?

I'm using `-sync onclose` to be sure that my data is actually stored
on the disk.  The write performance does suffer, especially for
use-cases like Git where some simple operations (like repacking) take
forever (because for some reason Git tries to touch each and every
`.git/objects/XX` folder...)
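
To make the cost concrete:  syncing on close means every small file pays
for a full flush to disk.  Below is a rough Python sketch of that effect.
It is only an analogy and not how the file-server implements `-sync`;
the function name `write_files`, the file names, and the sizes are
invented, and the timings depend heavily on the disk and the file system.

```python
import os
import tempfile
import time


def write_files(directory, count, payload, sync_on_close):
    """Write `count` small files and return the elapsed time in seconds.

    With sync_on_close=True each file is flushed and fsync'ed before it is
    closed, so the data must reach stable storage before the loop moves on.
    """
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"obj{i:04d}")
        with open(path, "wb") as handle:
            handle.write(payload)
            if sync_on_close:
                handle.flush()              # push Python's buffer to the kernel
                os.fsync(handle.fileno())   # ask the kernel to write it to disk
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"x" * 4096
    for sync in (False, True):
        with tempfile.TemporaryDirectory() as tmp:
            elapsed = write_files(tmp, 500, payload, sync)
            print(f"sync_on_close={sync}: {elapsed:.3f} s")
```

Many small files written this way is roughly the pattern a Git repack
produces, which would explain why it hurts so much there.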



> Last remark. BTRFS, to my knowledge, does not support reservations. You
> MUST make sure to use a pre-allocated storage for the /vicepX mountpoint
> or the ugly day of failing AFS writes will come during your next overseas
> vacation.


You mean in the case where `/vicepX` is a separate volume, but on the
same physical disk as other volumes, right?

(In my case I intend to use a dedicated BTRFS disk, over RAID, without
any subvolumes.)



> ZFS, although you don't want to go that way, works fine as well. Again,
> make sure to create a filesystem (i.e. subvolume) with a fixed
> reservation. AFAIK the FS takes care of providing enough space although
> you cannot disable COW. You keep all the goodies, duplication,
> deduplication, checksumming. I would suggest reading on ZFS setups for
> heavy database loads, should I have got you interested.


Thanks for the ZFS suggestions, however for me ZFS is a complete no-go
due to one main reason:  it's not in-kernel;  which means sooner or
later one would encounter problems.  The other reason is complexity:
I use OpenAFS for my own "self-hosted" / "home" needs, thus I want
something I can easily debug in case something goes wrong.  ZFS
doesn't give me much peace of mind;  too complex, too many options...

Thanks,
Ciprian.