[OpenAFS] Doubts about OpenAFS implementation in a company

Michael Meffie mmeffie@sinenomine.net
Wed, 18 May 2011 11:19:48 -0400

Hartmut Reuter wrote:
> Stanisław Kamiński wrote:
>> First of all, hi to everyone - it's my first own topic here :-)
>> I'm working for a company ~1000 ppl, three offices in Poland and three
>> other in bordering countries. OpenAFS was introduced about 6 years ago
>> when the company was quite a bit smaller, and the guy that did this left
>> no documentation and some of his design decisions are making me scratch
>> my head - that's part of the reason I'm writing this.
>> Other things that are important:
>> - about 2/3 of users work on Linux (CentOS) workstations, and their
>> homedirs are served from AFS
>> - 1/3 are Windows users
>> - Polish offices are connected using at least 10 Mbit symmetric links,
>> but the offices abroad might have much less. In one particular example
>> the link is asymmetric 10/1 Mbit (d/u)
>> - there is single AFS cell covering all the offices
>> - every office has its own db and fileserver (Debian 5/6)
>> - we rely on our partner to assign IP address space for us - net result
>> is that the weakest link location (10/1) has the lowest IP and there is
>> _nothing_ we can do about it
>> The last thing causes Ubik elections to constantly choose the server
>> located on the weakest link as sync site.
> This can be changed by making the slow database server with lowest ip-address a
> "clone". A clone never can become sync site and its votes do not count.
> Use "bos removehost <dbserver> <slowdbserver>" for all dbservers and then
> "bos addhost <dbserver> <slowdbserver> -clone" on all dbservers and restart
> everywhere the database instances.

As Hartmut says, clone sites can help if you want to force a particular
site to always be a secondary database site. How many db servers do you
have in your cell? Are they multihomed?
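
For reference, the procedure Hartmut describes could be sketched roughly
as below. This is only a sketch under assumptions: the host names are
hypothetical placeholders, and I am assuming the database instances in
your cell are ptserver and vlserver (adjust if you also run others). The
script only prints the bos commands so they can be reviewed first; it
does not run anything.

```shell
#!/bin/sh
# Sketch only: prints the bos commands for review instead of running them.
# All host names below are hypothetical placeholders.
DBSERVERS="db1.example.com db2.example.com db3.example.com"
SLOW="db3.example.com"

print_clone_plan() {
    # $1: the dbserver to demote to a clone; remaining args: all dbservers
    slow=$1
    shift
    # Re-register the slow server as a clone on every dbserver.
    for s in "$@"; do
        echo "bos removehost -server $s -host $slow"
        echo "bos addhost -server $s -host $slow -clone"
    done
    # Restart the database instances everywhere (instance names assumed).
    for s in "$@"; do
        echo "bos restart -server $s -instance ptserver vlserver"
    done
}

print_clone_plan "$SLOW" $DBSERVERS
```

After the restarts, udebug against the vlserver port (7003) on each
dbserver should show which site is currently the sync site.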

>> Also, we quite often have to move user volumes between different offices
>> - we've got quite a bit of rotation between them, say some 10-20 ppl per
>> week.

That should be ok, if you have fileservers which are close to those offices.
One thing AFS is very good at is moving volumes around to different fileservers.
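
As an illustration, moving one user's home volume to the fileserver in
the new office is a single vos move. The volume, server and partition
names here are hypothetical; the sketch only prints the command rather
than running it.

```shell
#!/bin/sh
# Sketch only: prints a vos move command for review; nothing is executed.
# Volume, server and partition names are hypothetical placeholders.
print_vos_move() {
    # $1 volume, $2 source server, $3 source partition,
    # $4 destination server, $5 destination partition
    echo "vos move -id $1 -fromserver $2 -frompartition $3 -toserver $4 -topartition $5"
}

print_vos_move user.jdoe fs-waw.example.com a fs-prg.example.com a
```

The volume stays usable by clients for most of the move, so this can be
done while the user is still working.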

>> Now, I've been assigned to improve AFS performance in any way possible.
>> It was very bad, then I changed server parameters to tune it to "large"
>> server options - that yield enormous speedup, but I still believe I can
>> get much more from the system.

Even "large" might not be large enough. One of the first things to check
is the number of callbacks needed for your fileserver.  Do you have the
xstat_fs_test tool installed? If so, it can report callback statistics
by running xstat_fs_test $host -c 3 -once