[OpenAFS] Doubts about OpenAFS implementation in a company

Hartmut Reuter reuter@rzg.mpg.de
Wed, 18 May 2011 13:23:15 +0200


Stanisław Kamiński wrote:
> First of all, hi to everyone - it's my first own topic here :-)
>
> I'm working for a company ~1000 ppl, three offices in Poland and three
> other in bordering countries. OpenAFS was introduced about 6 years ago,
> when the company was quite a bit smaller, and the guy that did this left
> no documentation and some of his design decisions are making me scratch
> my head - that's part of the reason I'm writing this.
>
> Other things that are important:
> - about 2/3 of users work on Linux (CentOS) workstations, and their
> homedirs are served from AFS
> - 1/3 are Windows users
> - Polish offices are connected using at least 10 Mbit symmetric links,
> but the offices abroad might have much less. In one particular example,
> the link is asymmetric 10/1 Mbit (d/u)
> - there is single AFS cell covering all the offices
> - every office has its own db and fileserver (Debian 5/6)
> - we rely on our partner to assign IP address space for us - net result
> is that the weakest link location (10/1) has the lowest IP and there is
> _nothing_ we can do about it
>
> The last thing causes Ubik elections to constantly choose the server
> located on the weakest link as sync site.

This can be changed by making the slow database server with the lowest
IP address a "clone". A clone can never become the sync site, and its
votes do not count.
Use "bos removehost <dbserver> <slowdbserver>" for all dbservers and then
"bos addhost <dbserver> <slowdbserver> -clone" on all dbservers, and
restart the database instances everywhere.
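
The steps above can be sketched as a small dry-run script. It only prints
the bos commands rather than running them, so they can be reviewed first.
The host names are made-up placeholders, and the instance names (ptserver,
vlserver) are an assumption about which database processes the cell runs;
add buserver or kaserver if they are in use. Actually running the commands
needs administrative rights on each server.

```shell
#!/bin/sh
# Dry-run sketch: prints the bos commands instead of executing them.
# All host names below are hypothetical placeholders.
DBSERVERS="db1.example.com db2.example.com db3.example.com"
SLOW="db1.example.com"   # the slow server with the lowest IP address

print_clone_commands() {
    # On every database server, re-register the slow server as a clone.
    for s in $DBSERVERS; do
        echo "bos removehost -server $s -host $SLOW"
        echo "bos addhost -server $s -host $SLOW -clone"
    done
    # Then restart the database instances everywhere
    # (instance names are an assumption, adjust to the cell).
    for s in $DBSERVERS; do
        echo "bos restart -server $s -instance ptserver vlserver"
    done
}

print_clone_commands
```

Once the printed commands look right, they can be pasted into a shell
or the echo calls can be dropped.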

>
> Also, we quite often have to move user volumes between different offices
> - we've got quite a bit of rotation between them, say some 10-20 ppl per
> week.
>
> Now, I've been assigned to improve AFS performance in any way possible.
> It was very bad, then I changed server parameters to tune it to "large"
> server options - that yielded an enormous speedup, but I still believe I can
> get much more from the system.
>
> There are two things that are, ahem, not as fast as one would like. The
> worse one is directory traversal - moving between levels of directories
> can take 5-10 seconds (on a workstation with 1 Gbit link to AFS server
> in its location). The other one is the upload/download speed itself -
> last time I measured, windows client d/u was 2/5 MB/s - I think I can
> get more than that.
>
> As I'm currently making my way through "Managing AFS" by Richard
> Campbell, I'm not yet fully up-to-speed on OpenAFS inner workings and
> such. Right now I only want to ask: is the design of our AFS system
> correct? Or did the guy introducing it make some short-sighted
> projections which don't hold water in the current environment (as
> described)? I'm talking here about single-cell design - although I'm not
> sure it's easy to move volumes between different cells.
>
> Other thing I'm worried about: can it be that having the sync site on
> slowest uplink causes everything to slow down? Is there any way to get
> some measurements for this?
>
> Thanks for reading all of this and not falling asleep :-) And waiting
> for your comments,
> Stan
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info


-- 
-----------------------------------------------------------------
Hartmut Reuter                  e-mail 		reuter@rzg.mpg.de
			   	phone 		 +49-89-3299-1328
			   	fax   		 +49-89-3299-1301
RZG (Rechenzentrum Garching)   	web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------