[OpenAFS] Failover
Jeffrey Hutzelman
jhutz@cmu.edu
Sat, 31 Dec 2005 20:03:40 -0500
On Saturday, December 31, 2005 12:36:40 AM -0600 Troy Benjegerdes
<hozer@hozed.org> wrote:
> The advantage of AFS over a single system is you can have as many
> incoming MTA machines, and imap servers as you want.
Yes, you can. But as the volume gets large, especially for any given
mailbox, the performance goes to hell. The problem is that whenever you
file a message into a mailbox, you change the directory containing the
mailbox. That means that if any other AFS client is also accessing that
directory, it has a callback that has to be broken (while YOU wait), and
then it has to fetch the entire directory again in order to be able to do
the next file lookup.
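
To put rough numbers on that, here is a toy model in Python. It is only a sketch under stated assumptions, not AFS code: the names FileServer and simulate are made up for illustration, every client is assumed to look at the mailbox directory before each delivery, and the writer is assumed to keep its own callback. It counts callback breaks and directory entries refetched as messages are filed.

```python
from dataclasses import dataclass, field


@dataclass
class FileServer:
    """Toy stand-in for a fileserver tracking callbacks on ONE directory."""
    dir_entries: int = 0
    callbacks: set = field(default_factory=set)  # clients holding a callback
    callback_breaks: int = 0
    entries_fetched: int = 0

    def fetch_dir(self, client):
        # The client fetches the entire directory and receives a callback.
        self.entries_fetched += self.dir_entries
        self.callbacks.add(client)

    def create_file(self, writer):
        # Filing a message changes the directory, so every OTHER client's
        # callback is broken before the write completes.
        # Assumption: the writer keeps its own callback.
        others = self.callbacks - {writer}
        self.callback_breaks += len(others)
        self.callbacks -= others
        self.dir_entries += 1


def simulate(num_clients, num_messages, initial_entries=0):
    """Return (callback_breaks, entries_fetched) for a round-robin workload."""
    server = FileServer(dir_entries=initial_entries)
    clients = list(range(num_clients))
    for i in range(num_messages):
        # Assumption: each client does a lookup in the mailbox directory
        # before every delivery, refetching if its callback was broken.
        for c in clients:
            if c not in server.callbacks:
                server.fetch_dir(c)
        server.create_file(clients[i % num_clients])
    return server.callback_breaks, server.entries_fetched


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        breaks, fetched = simulate(n, num_messages=100, initial_entries=1000)
        print(f"{n} clients: {breaks} callback breaks, {fetched} entries refetched")
```

In this model a single client never triggers a callback break, while each additional client adds one break and one full directory refetch per message filed, and the refetch gets more expensive as the mailbox grows.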
Once upon a time, more or less all of Carnegie Mellon's messaging needs
(mail, netnews, bboards) were handled by the Andrew Messaging System, a
distributed system based on AFS. AMS was an integrated part of the Andrew
project, and unlike any mail system in wide use today, was designed from
the ground up to take advantage of a distributed computing environment and
particularly a distributed filesystem. Most major components of the system
stored data in and communicated via the filesystem. Incoming MXs,
outgoing mail gateways, delivery, bboard filing, etc. could all run on
multiple machines, and it was possible to add or remove machines in any of
those pools at will.
Several years ago, Carnegie Mellon abandoned that system, choosing instead
to expend huge amounts of developer time on developing, maintaining, and
supporting an enterprise-grade distributed IMAP server package. The Cyrus
IMAP system has consumed more than one full-time employee's worth of effort
for many years now, and there is no sign that will change anytime soon.
One significant factor in the decision to go down that path was the fact
that AMS had serious scalability problems, largely because of the issue I
described above. You could add more mail delivery systems, but that meant
more callback breaks and more fetches of large directories from the
fileserver. Sure, it was necessary to develop software because there was
no off-the-shelf solution with the required robustness and stability. And
participation in standards efforts (and implementation of those standards)
was needed in order to ensure it would at least be possible to use
off-the-shelf _clients_. But without the serious performance problems AMS
was having, there would have been no need to consider changes to messaging
infrastructure at all.
I very much recommend against trying to store mail in AFS. There is no
gain to be had in reliability, scalability, or performance, and there are
any number of potential problems. If what you're trying to accomplish is
to get those features in a distributed mail server system, I suggest
looking at http://asg.web.cmu.edu/cyrus/
-- Jeff