[OpenAFS] OpenAFS in a production environment

Jeffrey Hutzelman jhutz@cmu.edu
Thu, 01 Sep 2005 21:43:32 -0400

On Friday, August 12, 2005 03:26:20 PM -0700 Lester Barrows 
<barrows@email.arc.nasa.gov> wrote:

> First, performance in general is not going to be as good as NFS for
> read/write data on the local network. With the 1.2 series clients,
> performance was actually rather terrible in our configuration for a
> single client. The 1.3 series seems to have come a long way toward
> fixing this, although write performance is still slower than NFSv3.
> OpenAFS has file locking semantics which seem to strongly favor reads
> over writes.

It's not entirely locking that's the issue.  OpenAFS caches reads, but 
writes must be pushed to the fileserver.  For files which are heavily 
shared, every write to a file requires notifying every other client using 
that file.  So yes, this is slower.
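
To illustrate why heavily shared writes cost more, here is a toy model in
Python.  It is only a sketch and not OpenAFS code: the class and method
names (ToyFileServer, ToyClient, fetch, store, invalidate) are made up for
this example, and it ignores the network, locking, and partial writes.

```python
class ToyFileServer:
    """Toy model: remembers which clients hold a cached copy of each file."""

    def __init__(self):
        self.data = {}          # path -> contents
        self.callbacks = {}     # path -> set of clients caching that path
        self.notifications = 0  # how many invalidation messages were sent

    def fetch(self, client, path):
        # A read that misses the client cache registers the client as a holder.
        self.callbacks.setdefault(path, set()).add(client)
        return self.data.get(path, b"")

    def store(self, writer, path, contents):
        # Every write reaches the server, which must then notify every
        # *other* client that still holds a cached copy.
        self.data[path] = contents
        for client in self.callbacks.get(path, set()) - {writer}:
            client.invalidate(path)
            self.notifications += 1
        self.callbacks[path] = {writer}


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # path -> contents
        self.fetches = 0  # how many reads had to go to the server

    def read(self, path):
        # Cached reads cost nothing; only a miss goes to the server.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)
            self.fetches += 1
        return self.cache[path]

    def write(self, path, contents):
        # Writes are always pushed to the server.
        self.cache[path] = contents
        self.server.store(self, path, contents)

    def invalidate(self, path):
        self.cache.pop(path, None)
```

In this model, if five clients have all read the same file and one of them
writes to it, the server sends four invalidations, and each of the other
four clients has to fetch the file again on its next read.  Repeated reads
with no intervening write never touch the server.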

There are certainly some performance issues, but they're rather more 
complex than is suggested here.  If it were easy, we'd have fixed it by now.

> Second, OpenAFS doesn't seem to work very well with NATs. This seems to
> mostly be an artifact of it being a UDP-based protocol. If you have a
> lot of clients behind NATs, OpenAFS may not be suitable for your use.
> The developers do not seem to be interested in a solution for this at
> the moment, although to be fair there are a number of other things they
> are working on.

OpenAFS _clients_ work fine behind a NAT that provides reasonable 
connection tracking and does not time out UDP port associations too 
quickly.  For those that do time out such associations quickly, it is 
possible to increase the frequency with which the cache manager polls the 
fileserver, resulting in a "keep-alive" effect, but this has the 
disadvantage of additional load on the network and fileservers.
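
The tradeoff can be put in rough numbers.  The sketch below is plain
arithmetic in Python, not an OpenAFS interface: the function names, the
0.5 safety factor, and the sample figures are all assumptions chosen for
illustration, and it assumes one probe per client per interval.

```python
def keepalive_interval(nat_timeout_s, safety_factor=0.5):
    """Pick a polling interval safely below the NAT's UDP timeout.

    safety_factor is a hypothetical margin (0 < factor < 1), not an
    OpenAFS setting.
    """
    if nat_timeout_s <= 0:
        raise ValueError("NAT timeout must be positive")
    if not 0 < safety_factor < 1:
        raise ValueError("safety factor must be between 0 and 1")
    return nat_timeout_s * safety_factor


def extra_probes_per_second(num_clients, interval_s):
    """Aggregate probe rate seen by the fileservers, one probe per client
    per interval."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return num_clients / interval_s
```

For example, a NAT that drops UDP associations after 60 seconds would call
for polling about every 30 seconds under these assumptions, and 3000 such
clients would add roughly 100 probes per second across the fileservers.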

That said, NATs break the Internet.  Avoid using them if you can.

> Third, client support for newer platforms has improved since we started
> using it but it isn't perfect yet. OpenAFS seems to just now be
> stabilizing on the Linux 2.6 series kernels in the newer 1.3 releases.
> Older releases of OpenAFS don't support 2.6 at all with PAGs to my
> knowledge. I wouldn't recommend running a pre-release filesystem (or
> pre-release anything) on production systems. OS X support seems good all
> around, but tends to come out a while after each new release of the OS.

At this point, OpenAFS 1.2 is pretty stale.  We did indeed decide not to do 
2.6 support in that version, but instead focus on the 1.3/1.4 branch, so if 
you want 2.6 support, then you'll need something relatively recent.  I 
suggest 1.4.0rc2 (or better, RC3 when that gets released).  Really, things
have been pretty stable for some time now; we've just been trying to squish
as many bugs as we can before a 1.4 release.

Locally, we are running 1.3.85 or so in production on 2.6 machines on both 
i386 and amd64, and have seen no problems.

PAG support has been available for quite some time.  Yes, if you run an old 
enough version you won't get PAG support.  So don't run something that old.

> Fourth, backups can of course be interesting due to the way OpenAFS
> stores files. The included backup system will take some scripting if you
> want to use it reasonably. I've been able to make it work well enough,
> but you might be better off with a third party backup solution if you
> don't want to invest a lot of time in it.

Frankly, I hate the included backup system.  However, there are a number of 
good alternatives available, depending on your environment.

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA