[OpenAFS] R/W replication

Dirk Heinrichs heinrichs@qis-systemhaus.de
Thu, 15 Feb 2001 11:27:38 +0100


Derek Atkins wrote:
>
> Dirk Heinrichs <heinrichs@qis-systemhaus.de> writes:
>
> > Ok, if it is the link that fails, and not one of the servers, you are
> > right. You would need some kind of conflict resolution. I wonder how the
> > Coda folks solved this. They support both disconnected operation and rw
> > replication. I think they use some kind of transaction processing
> > system, similar to databases. Would be nice to hear from some practical
> > experience with this (maybe I'll have to ask on their mailinglist).
>
> Well, OpenAFS should support disconnected operations, once it is
> up-ported by the UMich folks (I'm told they are working on it, in
> their copious amounts of spare time ;) From memory, disconnected ops
> work by saving all "changes" in a local log and then replaying the log
> when you "reconnect" to the network.  Because connecting and
> disconnecting require user intervention (i.e. an fs command), the UI
> can work with the user to resolve any conflicts that arise from the
> transaction log playback.
>
> Resolving conflicts in an automatic manner can be, um, challenging.  I
> don't know how CODA does it, honestly.  The way I would do it in
> OpenAFS, today, is operationally (especially if the issue is having
> lots of hosts writing to different files).  Instead of having one
> "replicated" RW volume, I would have a bunch of (different) RW
> volumes, and have each "file" be in a different volume.  You could
> even use RO clones for the other files, and have each "host" know to
> mount their particular RW volume and then release the volume whenever
> they make changes.

OK, I did some digging on the Coda website to find out how things are
done, and found some papers describing how Coda handles server
replication.

First, Coda has the RPC2 component, which allows for multicast RPCs.
This enables Coda to use a read-one, write-many approach, where a file
is read from one server (or from the cache) and, upon change, written
back to all replicas at once as one (atomic) operation. To track the
replicas of a particular volume, Coda uses Volume Storage Groups (VSGs),
which list the servers that hold a replica of that volume. Then there is
the Accessible VSG (AVSG), the subset of the VSG that is currently
reachable; it is updated when a server listed in the VSG becomes
inaccessible. When that server reconnects, Coda replays the logs to
bring it back in sync with the others.
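
A minimal sketch of that scheme in Python, as I understand it from the
description above. The names (Server, ReplicatedVolume, pending) are
made up for illustration and are not Coda's API. The multicast RPC and
the atomicity of the write are not modelled, and the per-server log of
missed writes is an assumption about how the replay could work.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One replica server (hypothetical model, not a Coda data structure)."""
    name: str
    up: bool = True
    files: dict = field(default_factory=dict)


class ReplicatedVolume:
    def __init__(self, servers):
        # VSG: every server that holds a replica of this volume
        self.vsg = list(servers)
        # Assumed mechanism: one log of missed writes per server
        self.pending = {s.name: [] for s in self.vsg}

    def avsg(self):
        # AVSG: the members of the VSG that are reachable right now
        return [s for s in self.vsg if s.up]

    def read(self, path):
        # read-one: the first reachable replica that has the file answers
        for s in self.avsg():
            if path in s.files:
                return s.files[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # write-many: update every reachable replica, log the write
        # for every replica that is currently unreachable
        if not self.avsg():
            raise OSError("no replica reachable")
        for s in self.vsg:
            if s.up:
                s.files[path] = data
            else:
                self.pending[s.name].append((path, data))

    def reconnect(self, server):
        # Replay the missed writes in order, then clear the log
        server.up = True
        for path, data in self.pending[server.name]:
            server.files[path] = data
        self.pending[server.name].clear()
```

With three servers a, b and c in the VSG, marking c as down before a
write leaves c with the old contents and one entry in its log; calling
reconnect(c) replays that entry, and c matches a and b again. Note that
this only covers a failed server. If the link fails and both sides keep
writing, the logs can conflict, which is the case discussed earlier in
this thread.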

It was stated that with Coda, one could even replace a defective disk in
one of the servers, recreate the volume(s) which were stored there and
just do an 'ls -lR' on the relevant directory tree to get the new disk
filled up again.

I found those docs at
http://www.cs.cmu.edu/afs/cs/project/coda/Web/docs-coda.html

Anyway, from the mailing list I gathered that there seem to be some
stability issues, and that a usable Windows port is lacking, which
some users in my company would need.

Bye...

	Dirk
-- 
Dirk Heinrichs		| Tel:	+49 (0)241 413 260
QIS Systemhaus GmbH	| Fax:	+49 (0)241 413 2640
Jülicher Str. 338b	| Mail:	heinrichs@qis-systemhaus.de
D-52070 Aachen		| Web:	http://www.qis-systemhaus.de