[OpenAFS] Understanding questions backup volume
Jeffrey Hutzelman
jhutz@cmu.edu
Thu, 09 Feb 2006 18:53:48 -0500
There seems to have been some confusion in this thread, so I guess I
will speak up...
On Thursday, February 09, 2006 11:43:45 AM +0100 Lars Schimmer
<l.schimmer@cgv.tugraz.at> wrote:
> Hi!
>
> I start using backup volumes ;-)
> It is fairly easy to create one and mount them.
> But: Where is the difference between RO copies and a backup volume?
For a moment, let's limit ourselves to an RO volume on the same site as the
RW. In that case, as far as the fileserver is concerned, there is almost
no difference. Both RO and backup volumes are efficient copy-on-write
clones of a read-write parent volume. Neither is stored as a "diff" with
respect to the other; rather, both volumes have complete, independent vnode
indexes, but they share storage for files that have not changed since the
clone was updated.
When a shared file is modified in the parent volume, a copy is made, so
that the parent volume can be updated with the new contents while the
clones retain the original contents. If there are multiple clones, they
continue to share storage for that file. Note that the clones themselves
are always "read-only" in the sense that they are not writable; only the RW
parent volume can be modified.
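As a rough illustration of the copy-on-write behavior described above, here is a toy in-memory model. It is only a sketch: the names (blobs, refcount, store, clone, write) are made up for this example and do not correspond to anything in the fileserver source, and a "volume" is reduced to a dict standing in for a vnode index.

```python
import itertools

_ids = itertools.count()
blobs = {}     # blob id -> file contents
refcount = {}  # blob id -> number of vnode indexes pointing at it


def store(data):
    """Allocate storage for new contents, referenced once."""
    bid = next(_ids)
    blobs[bid] = data
    refcount[bid] = 1
    return bid


def clone(index):
    """Make a clone: copy the whole index and share every blob."""
    for bid in index.values():
        refcount[bid] += 1
    return dict(index)


def write(rw_index, name, data):
    """Write to the RW volume, copying first if the blob is shared."""
    old = rw_index.get(name)
    if old is not None and refcount[old] == 1:
        blobs[old] = data          # unshared: update in place
        return
    if old is not None:
        refcount[old] -= 1         # clones keep the original contents
    rw_index[name] = store(data)   # parent gets its own new copy


rw = {"a.txt": store("v1"), "b.txt": store("hello")}
ro = clone(rw)   # stands in for an RO clone
bk = clone(rw)   # stands in for a BK clone
write(rw, "a.txt", "v2")

print(blobs[rw["a.txt"]], blobs[ro["a.txt"]], blobs[bk["a.txt"]])  # v2 v1 v1
print(ro["a.txt"] == bk["a.txt"], rw["b.txt"] == ro["b.txt"])      # True True
```

After the write, the two clones still point at the same stored copy of a.txt, and the unmodified b.txt is still shared by all three indexes. Only write() on the RW index is provided, mirroring the fact that clones are never writable.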
The fileserver does treat BK volumes specially in a couple of ways.
Notably, it keeps track of which volume is "the" backup volume for a given
RW parent, if one is present at all, and it keeps track in the parent's
volume header of the last time such a volume was cloned. No similar
tracking is done for RO volumes or other clones (in fact, to the fileserver
there is _no_ difference between "real" RO volumes and those created with
the 'vos clone' command).
> I know, backup volumes should be used for backup, RO for distributing
> data all over the cell.
Well, yes; those are the uses for which these types were intended. Note
that there is a variety of special handling in clients related to these
volume types. For example, volume names ending in .backup or .readonly are
looked up in the VLDB under the parent name, but are considered to refer to
the RO or BK volume. The cache manager has the concept of an "RO" path; a
normal (#) mount point in an RO volume resolves to an RO volume, if there
is one. The VLDB records a separate set of sites for RO clones, while the
BK volume is presumed to live in the same place as the RW (as it must,
since it shares storage with its parent).
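The name handling described above can be sketched roughly as follows. This is an illustration only, not the client's actual code: the function names, the dict layout used for a VLDB entry, and the volume and server names are all hypothetical.

```python
def split_volume_name(name):
    """Return (parent name to look up in the VLDB, volume type wanted)."""
    for suffix, vtype in ((".backup", "BK"), (".readonly", "RO")):
        if name.endswith(suffix):
            return name[: -len(suffix)], vtype
    return name, "RW"


def sites_for(entry, vtype):
    """Pick candidate sites from a (hypothetical) VLDB entry."""
    if vtype == "RO":
        return list(entry["ro_sites"])   # RO sites are recorded separately
    return [entry["rw_site"]]            # BK is presumed to sit with the RW


# Hypothetical entry, keyed by the parent (RW) volume name.
vldb = {
    "user.lars": {
        "rw_site": ("fs1", "/vicepa"),
        "ro_sites": [("fs1", "/vicepa"), ("fs2", "/vicepb")],
    }
}

parent, vtype = split_volume_name("user.lars.backup")
print(parent, vtype, sites_for(vldb[parent], vtype))
```

The point of the sketch is that the suffix is stripped before the lookup, so all three volume types are found through the single entry for the parent name.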
> A backup should be made of the backup volumes, because this doesn't lock
> the RW volumes for a long time.
Backup volumes can be used for this purpose, and indeed that was part of
the reason for having them, but as was pointed out, any clone will do --
you can even have 'vos dump' create a temporary one on the fly for you.
However, another part of the original intent was that backup volumes would
be cloned once per day (there is even a 'vos backupsys' command for this
purpose), and mounted where users could find them. So, if a user
accidentally deletes a file, he can retrieve it from the backup clone
without bothering someone to do a restore.
> And if I vos dump the backup volumes to a backup server (amanda-afs or
> just plain dump) I could rebuild the backup volumes. Does this help me
> in case of a lost RW volume?
Well, if you 'vos dump' the backup volume, you can restore an RW volume
from that dump. Or, you can restore an RO copy with a different name, with
'vos restore -readonly'. That's another thing RO volumes are good for --
you can have standalone RO volumes with no RW parent, either as a result of
a readonly restore, or as a result of replication.
> At least a RO copy could be converted to a RW volume in nearly NO time,
> but a backup volume?
I've not looked at the code in a lot of detail, so I don't know whether it
will work to "convert" a clone (RO or BK) which is colocated with a parent.
The convertROtoRW operation is designed to let you "promote" an RO volume
that lives on a different partition from the RW, in the event the partition
containing the RW fails. Since RO and BK volumes living on the same
partition as the RW share storage with it, it is unlikely that they will
survive intact if something destroys the RW parent.
> Our cell is designed to have a RO copy of every RW volume.
> And if one RO copy of a RW volume resist on a file server housed in a
> datacenter "far away" I've got a quick and easy 1-day-backup in case of
> big error here. With the ROtoRW convert the cell is back up very fast.
> So why use backup volumes?
I think I answered that above. You're doing something very nonstandard --
trying to use replication to provide failure recovery for volumes that are
RW by nature. The replication feature was designed to provide reliable,
scalable storage for data which is accessed by many clients and changed
infrequently. In the Andrew system, it was originally used for
distributing system software, and that is still its primary use today.
> Are backup volumes built incremental?
Backup and RO clones are built in the same fashion, by making a copy of the
parent volume's vnode index and incrementing the refcounts on all of the
files in the volume. When an existing backup clone is recloned, the
refcounts on the files that were present in the previous volume are
decremented, and any that are no longer referenced are freed. No data is
copied in any event.
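The refcount bookkeeping for cloning and recloning can be sketched like this. It is a simplified model under assumptions of my own: file ids are plain integers, the helper names are invented, and the order of operations in reclone() (take new references, then drop the old ones) is a choice made for the example rather than a statement about what the volserver does.

```python
from collections import Counter

refcount = Counter()   # file id -> number of volume indexes referencing it
freed = []             # file ids whose storage has been released


def make_clone(parent_index):
    """Clone = copy of the parent's index plus a refcount bump per file."""
    for fid in parent_index.values():
        refcount[fid] += 1
    return dict(parent_index)


def drop_clone(clone_index):
    """Release a clone: decrement refcounts, free files nobody references."""
    for fid in clone_index.values():
        refcount[fid] -= 1
        if refcount[fid] == 0:
            del refcount[fid]
            freed.append(fid)


def reclone(parent_index, old_clone):
    """Replace an existing backup clone with a fresh one."""
    new_clone = make_clone(parent_index)
    drop_clone(old_clone)
    return new_clone


# Parent starts with three files, each referenced once by the parent itself.
parent = {"a": 1, "b": 2, "c": 3}
for fid in parent.values():
    refcount[fid] += 1

bk = make_clone(parent)

# Between clones, "a" is rewritten (new file id 4) and "c" is deleted.
# Each change drops the parent's reference to the old file.
refcount[1] -= 1
parent["a"] = 4
refcount[4] += 1
refcount[3] -= 1
del parent["c"]

bk = reclone(parent, bk)
print(sorted(freed), dict(refcount))   # [1, 3] {2: 2, 4: 2}
```

Only counters and index entries change; no file contents are touched at any point, which is why recloning is cheap regardless of volume size.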
> Because with only RO copies, I get a 1-day-backup, but I need a
> long-term-backup with incremental backups.
Keeping such backups in the form of online volumes is not terribly
efficient. Long term backups should be kept in the form of volume dumps,
possibly compressed, and stored on disk and/or tape. There are several
options available for managing backups; you can use 'vos dump' or the
backup system included with OpenAFS, and there are a number of third-party
packages, both open-source and commercial, which offer AFS backup support.
All of these approaches are capable of making use of incremental volume
dumps.
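For the plain 'vos dump' route, a small helper that builds the command line for a full or incremental dump might look like the sketch below. The -id, -time and -file options are, to the best of my knowledge, the documented ones, with "-time 0" requesting a full dump; check the vos_dump man page for your version before relying on this. The volume name and file paths are hypothetical, and the helper only builds the argument list without running anything.

```python
from datetime import datetime


def vos_dump_argv(volume, outfile, since=None):
    """Build a 'vos dump' command line.

    since=None requests a full dump (-time 0); otherwise an incremental
    dump of changes since the given datetime (mm/dd/yyyy hh:MM format).
    """
    when = "0" if since is None else since.strftime("%m/%d/%Y %H:%M")
    return ["vos", "dump", "-id", volume, "-time", when, "-file", outfile]


# Hypothetical schedule: one full dump, then an incremental relative to it.
full = vos_dump_argv("user.lars.backup", "/dumps/user.lars.full")
incr = vos_dump_argv(
    "user.lars.backup",
    "/dumps/user.lars.incr",
    since=datetime(2006, 2, 9, 3, 0),
)
print(" ".join(full))
print(" ".join(incr))
```

The resulting lists could be handed to subprocess.run() on a machine with the AFS tools and suitable tokens. Dumping the .backup clone rather than the RW volume keeps the RW from being locked for the duration of the dump, as discussed earlier in the thread.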
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA