[OpenAFS-devel] tuning underlying filesystems for afs

Horst Birthelmer horst@riback.net
Thu, 14 Oct 2004 23:40:55 +0200


On Oct 14, 2004, at 11:18 PM, Martin MOKREJŠ wrote:

> Horst Birthelmer wrote:
>> On Oct 14, 2004, at 5:50 PM, Martin MOKREJŠ wrote:
>>> Hi,
>>>  I'm installing a new AFS cell and configuring a huge AFS server: a 1 TB
>>> RAID5 array on a dual controller based on a U160 adapter. It has 6 GB
>>> RAM and two 3 GHz Xeon CPUs. It runs a linux-2.4.28-pre3 kernel.
>>>
>>>  AFS has inode-based and namei-based fileservers. What is the
>>> difference between them in terms of performance?
>> There isn't any reason to install an inode fileserver, and Linux
>> fileservers are namei by default.
>>  From now on I'm referring to namei fileservers...
>
> OK. I was hoping inode based fileservers are faster, as they work
> on the lower level ... I take it as you say.
>
>>>  Filesystems are usually tuned for large or small files. Which is the
>>> case for a fileserver? The /vicepX partitions are mostly filled with few
>>> small files, which correspond to volumes, if I'm right. That has nothing
>>> to do with the size of files stored in AFS volumes, I know ... but should
>>> I tune for "huge" or "small" files?
>>>
>> It has to do with the size of your files. What it has nothing to do with
>> is your directories: the files will definitely be there, just not in any
>> directory you would recognize from where they sit on your AFS client.
>> So if you have a lot of small files ... there will be a lot of small
>> files on your fileserver.
>> Your volumes and files get organized by the fileserver using a
>> documented algorithm based on some hashing, but I think you don't want
>> to know about it ;-)
>
> Yes and no. ;) ReiserFS uses the r5, tea and rupasov algorithms. All of
> them make the filesystem fast but take CPU (sometimes all of it). First of
> all, I'd prefer the server to be unloaded, as there will be some
> computational jobs too. Under a load above 2 I guess filesystem performance
> would suffer under reiserfs, as the bottleneck is already the CPU on an
> unloaded system, as seen in my tests. I believe the cause of such CPU
> consumption is the computation of those hashes or whatever they are. ;-)
>
> Second, I'm not sure it makes sense to sort files inside the fileserver
> and subsequently sort them again within reiserfs/xfs. At least that's what
> I think is happening.
>

It's not that the fileserver does something with your files. It just
creates a certain directory structure in which it is able to find a file
in a reasonable amount of time, based on the metainformation from the
system (that is, volume name, FID, etc.).
The fileserver just uses the filesystem below it. So if you have a lot
of small files, I think Reiser is fast because it stores the actual data
in trees. On rather big files it won't gain you anything; it will just
consume CPU time ... :-)
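
To make the idea of a hash-derived directory structure concrete, here is a
minimal sketch. It is NOT the actual OpenAFS namei layout: the function name
hashed_path, the use of SHA-1, the two-level fan-out and the
"volume.vnode" file name are all assumptions chosen for illustration. It only
shows how metadata alone can determine where a file lives, so no directory
scan is needed to find it.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(partition: str, volume_id: int, vnode_id: int,
                fanout: int = 256) -> PurePosixPath:
    """Map (volume, vnode) metadata to a two-level directory path.

    Hypothetical layout for illustration only; the real namei
    fileserver uses its own scheme.
    """
    key = f"{volume_id}:{vnode_id}".encode()
    digest = hashlib.sha1(key).digest()
    # Two bytes of the digest pick the two directory levels, which
    # spreads files evenly and keeps every directory small.
    top = digest[0] % fanout
    sub = digest[1] % fanout
    return (PurePosixPath(partition) / f"{top:02x}" / f"{sub:02x}"
            / f"{volume_id}.{vnode_id}")
```

The same inputs always give the same path, so a lookup is one hash
computation plus whatever the underlying filesystem needs to resolve a
short path.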

You can't avoid the calculation of the hashes in the filesystem, just as
you can't avoid it in the fileserver. You can only keep it within bounds.
BTW, you don't need any logging file system (from my point of view).
It's just overhead. If your fileserver crashes during a save operation,
that file's gone anyway. A reboot or restart won't bring that file back,
and the fileserver (and I do mean the piece of software here) starts with
a salvager run anyway. So you won't really get any big advantage from
your logging. It's just nice to have ;-).
Maybe I'm starting a (religious) war here...

>>>  I expect to have several files above 1GB in afs volumes, in general
>>> more huge files than small ones.
>>>
>>>  Does AFS/the kernel mount /vicepX partitions automatically? For
>>> example, xfs offers several mount options which affect performance.
>>> How can I take advantage of such options under afs?
>> AFS doesn't do anything by itself!!
>> Nobody will mount your partitions until you do that (on the server
>> side) and the parameters are in your hands as well. Tune as you want
>> or need.
>> The fileserver on startup will just look at directories named
>> /vicepXX. That's all...
>>>
>>>  I'm attaching my current results from bonnie++ tests. In general,
>>> xfs is as fast as reiserfs, except for random operations. For random
>>> operations, reiserfs is the best, then come ext2, ext3 and xfs as the
>>> last one. At least if I interpret the numbers correctly.
>>>
>> I can't and won't interpret your decisions regarding the tuning.
>
> That's a pity. But thanks anyway for the nice response! ;)
>

It's all yours :-)) (the decision, I mean)
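
One way to ground that decision is to look at the file sizes you actually
have, since on a namei fileserver small AFS files end up as small files on
the partition. The following is a rough sketch, not an OpenAFS tool: the
function name size_histogram and the 64 KiB / 1 GiB thresholds are
arbitrary assumptions and should be adjusted to your own data.

```python
import os
import stat
from collections import Counter


def size_histogram(root, small_limit=64 * 1024, large_limit=1024 ** 3):
    """Count regular files under root as small, medium or large.

    The thresholds are illustrative defaults, not recommendations.
    """
    counts = Counter(small=0, medium=0, large=0)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices
            if st.st_size < small_limit:
                counts["small"] += 1
            elif st.st_size >= large_limit:
                counts["large"] += 1
            else:
                counts["medium"] += 1
    return counts
```

Run it against the data you plan to move into AFS (or an existing /vicepX
partition) and compare the counts before choosing what to tune for.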

Horst