[OpenAFS] Replication

Sun, 2 Mar 2003 22:30:35 -0700

To the best of my knowledge, OpenAFS does not include any
scheduled-replication mechanism.  Transarc/IBM AFS did not, and I don't
remember any such change being made to OpenAFS.  DFS did/does.

In an environment with over 15000 replicated AFS volumes, we quite
successfully managed to automate "vos release" -- the process that manages
the updates of the read-only replica.  Umm, you use the word "mirror" to
describe a replica, and though you probably already know this, I just want
to make sure:  The replicas aren't really mirrors of the RW, in the sense of
RAID disk mirroring, because the latter is happening in real time and AFS
replication isn't.

Our approach to automating the updates in our environment was:

1) choose a secure machine for managing the replicas -- you can run the
"vos" commands as root with the "-localauth" switch to avoid having to get
admin tokens
2) write a script that:
    a.  runs vos examine against the read-only volume of interest, parses out
the update times, and does a "sort -u" or the equivalent
    b.  we might have run vos examine against the RW as well -- I don't
remember if you get the RW update time from the vos exa of the RO volume
    c.  So you've got the update times of all of the ROs and of the single
RW
    d.  If the "sort -u" in step (a) results in more than one update time,
the ROs are out of synch with each other -- so we'd do a vos release to
correct this
    e.  If the ROs are in synch and if the RW is "older" than the ROs, then
the RW hasn't been changed since the last vos release -- so don't do
anything
    f.   If the ROs are in synch but the RW is "younger" than the ROs, then
the RW has been changed since the last vos release -- so update the ROs with
vos release
    g.  Check the result codes from vos release
    h.  Repeat the vos examines and reinspect the update times.
    i.   Repeat the vos release and/or use vos release -force if steps (g)
or (h) indicate a vos release failure
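
The decision in steps (c) through (f) can be sketched roughly as below. This is a minimal Python sketch, not the original script: the names `Action` and `decide` are hypothetical, and it assumes the update times have already been parsed out of the vos examine output into comparable values (epoch seconds, for example). Treating equal RW and RO times as "nothing to do" is also an assumption; the steps above only cover "older" and "younger".

```python
from enum import Enum


class Action(Enum):
    RELEASE = "release"
    SKIP = "skip"


def decide(ro_update_times, rw_update_time):
    """Decide whether a volume needs a vos release.

    ro_update_times: update time of every RO site (hypothetical input,
        assumed already parsed from vos examine output).
    rw_update_time: update time of the single RW volume.
    """
    distinct = set(ro_update_times)  # the "sort -u" of step (a)
    if not distinct:
        raise ValueError("no RO update times supplied")
    if len(distinct) > 1:
        # Step (d): the ROs disagree with each other, so release.
        return Action.RELEASE
    (ro_time,) = distinct
    if rw_update_time > ro_time:
        # Step (f): the RW changed since the last release.
        return Action.RELEASE
    # Step (e): the RW is not newer than the ROs (equal counts as
    # unchanged here, which is an assumption).
    return Action.SKIP
```

Steps (g) through (i), checking the result code and retrying, would wrap whatever actually runs vos release when `decide` returns `Action.RELEASE`.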

It might seem odd to do a vos examine rather than just do the "vos release"
and be done with it.  After all, the vos release only updates files that
have been changed, and if none have been changed, it shouldn't take very
long.  Right?

You might want to repeat the benchmarking we did -- on your networks --
which was to vos release a volume to update the ROs, make sure they're
updated, and then do a "time vos release <volname>" and see how long it
takes for AFS' vos release to not-update the ROs.
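
If you'd rather collect the timings from a script than from the shell's "time", something like the following would do. It is only a sketch: `time_command` is a hypothetical helper, and the command you hand it (a vos release or a vos examine of your own volume) is up to you.

```python
import subprocess
import time


def time_command(argv):
    """Run argv and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,  # only the timing and exit code matter here
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode


# Hypothetical usage, with your own volume name substituted:
#   elapsed, code = time_command(["vos", "release", "<volname>"])
```

Run it a few times for each command and compare, since a single measurement over a shaky link won't tell you much.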

The volumes we were dealing with were distributed world wide over some
questionable network links, and this certainly affected our results.

What we found -- again in our environment, and your mileage may vary -- was
that the "vos examine" operation and concomitant parsing and comparing of
update times took several seconds, while the vos release of an unchanged RW
took ten seconds or more.  The vos examine method was ten to twenty
seconds faster in determining whether or not a volume actually needed to be
released.

With 15000 volumes to look at, this was significant.  We were sometimes
updating hundreds of replicas across the questionable networks.  (Frame
relay with a burst rate of 256K.  K-what I'd have to ask.  Take a bit,
divide it into little pieces, call it a bit-smidgeon.  We might have been
getting 256K bit-smidgeons.)
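
To put a rough number on "significant": the arithmetic below assumes, purely for illustration, that all 15000 volumes are checked one after another in a single pass and that none of them actually needs a release. `hours_saved` is a hypothetical helper, and the per-volume savings are the ten to twenty seconds quoted above.

```python
def hours_saved(volumes, seconds_saved_per_volume):
    """Total time saved over one serial pass, in hours."""
    return volumes * seconds_saved_per_volume / 3600


low = hours_saved(15000, 10)
high = hours_saved(15000, 20)
print(f"{low:.1f} to {high:.1f} hours saved per full pass")
# prints "41.7 to 83.3 hours saved per full pass"
```

Running the checks in parallel would shrink the wall-clock figure, but the saved work per volume stays the same.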

If your RWs and ROs are network adjacent, as yours sound to be, and you only
have a few AFS volumes (or maybe a few hundred, can't speak for your
environment) an unneeded "vos release" might be a very quick operation, and
the whole vos examine approach a waste of time.

Your second question is whether or not the vos release is going to affect
your system resources.

Yes.

A vos release will always affect network and disk activity at the server
where the RW "master" is located, and at each of the RO sites.  The changed
files are read from the RW and shipped, in their entirety, to the RO sites.
NB -- the changed _files_ are sent in their entirety, not the changed
_volume_.  (vos release -force will force all files in the volume to be
repropagated to the RO sites.  Use wisely, especially for big volumes.)

Whether or not a vos release or series of vos releases are even noticeable
will depend on how much is changing and how rapidly, and how frequently
you're doing the vos release.

How frequently _are_ you planning to update the ROs?

When you say mirror, do you mean real time mirroring, as in RAID, but over
AFS?

If so, I suspect you'll want to 1) keep in mind that it's not going to be
real time mirroring, because reading from the RW, pushing across the
network, and writing to the RO is going to take time and 2) you're going to
need to do some benchmarking to see how often vos releases are practical.
Running vos release in a tight loop strikes me as a not-good idea, but
again, your mileage may vary and I have no idea of the sizes of
files/volumes you're talking about.

You might look at the "volinfo" command -- it's helpful for identifying
volumes that are being heavily written.

Some of your website is almost certainly static, some nearly static, some
dynamic.  If you can identify the volumes that are static, near static, and
dynamic, you might find that with a little reorganization of volumes
(splitting, for example), you can keep the number of volumes that are
changing to a minimum.

Wish I could give you a definitive answer.  I guess "It depends" might be
considered a definitive answer :)

Gather some data ("vos release" vs. "vos examine" timings, number of
volumes, which are static, which are dynamic, which are in between), see if
reorganization is going to help, and feel free to get in touch.

Kim
---------------------------------------------------
Dexter "Kim" Kimball
CCRE, Inc.
dhk@ccre.com

----- Original Message -----
From: "Chris Snyder" <csnyder@mvpsoft.com>
To: <openafs-info@openafs.org>
Sent: Sunday, March 02, 2003 8:33 PM
Subject: [OpenAFS] Replication

> We're currently running a small OpenAFS system with a single server. I'm
> currently researching what it would take to add a second server for
> reliability purposes. This is mainly to ensure uptime of our web sites,
> which are stored on AFS. I'd like to have the second server be a read-only
> mirror of the master, updated automatically. That way, if the master goes
> down, the web server can still get access to the web site files, albeit
> read-only (which isn't much of a problem).
>
> I'd like to use replication to accomplish this. As far as I can tell,
> replication has to be done manually. Is there a way to have replication be
> done automatically other than a cron job? If a cron job is the only
> method, will the performance of the servers suffer at all during the
> replication? Thanks in advance.