[OpenAFS] Databases & AFS (revisited)
Jeffrey Hutzelman
jhutz@cmu.edu
Thu, 18 Jan 2007 00:03:09 -0500
On Saturday, December 23, 2006 06:14:32 PM +0100 Davor Ocelic
<docelic@mail.inet.hr> wrote:
> Looking at [2], which appears to be CMU's class assignment, the
> students are supposed to create a Postgres database within their
> AFS volumes, without a word of problems that might create.
A bit delayed, but...
That document is over 3 years old; AFAIK it does not represent a "current"
assignment for any class. It represents one assignment for one class,
developed by the faculty teaching that class. It should certainly not be
taken as CMU's position on whether putting database files in AFS is a good
idea.
Some applications, including database servers, use byte-range locking.
Depending on your platform, byte-range locks may be handled locally but
turned into whole-file locks on the server, handled locally but not
reflected on the server at all, or completely ignored. UNIX
applications that depend on working byte-range locks will generally not
work when the same file is used by multiple AFS client systems at the same
time; however, many of them will work fine if all programs using the file
are on the _same_ AFS client, or if there is only one such program at a
time.
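
For readers who have not met byte-range locking, here is a minimal sketch
in Python of what such an application does, using POSIX locks through
fcntl.lockf(). The helper names (try_lock_range, probe_byte_range_locks)
are made up for this illustration and come from no database or AFS tool.
The parent locks bytes 0-9 of a file, and a child process then tries an
overlapping range and a disjoint one. With working byte-range locks, as on
a typical local filesystem, the overlapping attempt should be refused and
the disjoint one granted. This only exercises two processes on one
machine; it says nothing about two different AFS clients, which is the
case that generally does not work.

```python
import fcntl
import os
import tempfile


def try_lock_range(fd, start, length):
    """Try a non-blocking exclusive lock on bytes [start, start + length)."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
        return True
    except OSError:
        # Another process holds a conflicting lock (EAGAIN or EACCES).
        return False


def probe_byte_range_locks(path):
    """Return (overlapping_granted, disjoint_granted) as seen by a child process."""
    fd = os.open(path, os.O_RDWR)
    try:
        if not try_lock_range(fd, 0, 10):
            raise RuntimeError("could not lock bytes 0-9 in the parent")
        pid = os.fork()
        if pid == 0:
            code = 4  # reported if the child fails unexpectedly
            try:
                child_fd = os.open(path, os.O_RDWR)
                overlapping = try_lock_range(child_fd, 0, 10)
                disjoint = try_lock_range(child_fd, 20, 10)
                code = (1 if overlapping else 0) | (2 if disjoint else 0)
            finally:
                os._exit(code)
        _, status = os.waitpid(pid, 0)
        code = os.WEXITSTATUS(status)
        if code & 4:
            raise RuntimeError("child process failed while probing locks")
        return bool(code & 1), bool(code & 2)
    finally:
        os.close(fd)  # closing the descriptor also drops the parent's lock


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 64)
    try:
        print(probe_byte_range_locks(tmp.name))
    finally:
        os.unlink(tmp.name)
```

If the overlapping attempt is granted, byte-range locks are not being
enforced between processes for that file, and an application relying on
them cannot safely share it.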
Even without the potential locking problems and performance penalties,
running a database server or other long-running service backed by data
stored in AFS (or any non-local filesystem) is fraught with peril. Such a
service, running on a perfectly working machine, can unexpectedly lose
access to its data due to network problems, a fileserver outage, or even
simple things like loss of tokens. This is not something I would recommend
for a production service.
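
As a rough illustration of those failure modes, a long-running service
could probe its data file before relying on it. This is only a sketch:
check_data_access is a hypothetical helper, and the mapping from errors to
causes is my assumption, not something AFS guarantees. A lost token would
typically show up as a permission error, while network or fileserver
trouble may surface as some other I/O error, or as a call that simply
blocks, which this sketch does not handle.

```python
import errno


def check_data_access(path):
    """Classify whether `path` is currently readable by this process."""
    try:
        with open(path, "rb") as f:
            f.read(1)
        return "ok"
    except PermissionError:
        # Assumption: expired or missing credentials appear as EACCES/EPERM.
        return "permission-denied"
    except FileNotFoundError:
        return "missing"
    except OSError as exc:
        # Anything else, e.g. a timeout reported by a network filesystem.
        return "io-error:" + errno.errorcode.get(exc.errno, str(exc.errno))
```

A probe like this can report that access was lost, but it cannot make the
service safe: the data can still become unreachable between the check and
the next read or write.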
However, short-term, light-duty uses like the Postgres assignment you
mentioned will probably be OK. In these situations, the user is running
the database server using his own tokens, the database files are not
accessed by anything else, and the server only runs as long as the user is
logged in (in fact, the "servers" mentioned in this assignment are actually
not servers at all, but public timesharing systems -- the users have only
ordinary unprivileged access, and the machines reboot every night). Since
the database does not contain any critical data, a network or fileserver
outage creates an inconvenience but no serious data loss.
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA