[OpenAFS] Re: Preferred way to do backup?

Stephen Joyce stephen@physics.unc.edu
Wed, 5 Jan 2011 12:45:12 -0500 (EST)


I'm glad to hear it's being used! It's worked well for us since 2006.

Separate BackupAFS servers should allow it to scale, as does having your data 
distributed across multiple fileservers. But the first limiting factor most 
sites would probably hit is the raw capacity of direct-attached storage 
(unless you're backing up to a SAN, etc.), then CPU speed if you use 
compression.

However, intelligently designing your volume sets, keeping them a manageable 
size, and scheduling full dumps of separate volume sets so that they're 
spread out rather than clustered all in the same night(s), gets you a 
LOOOONG way up the scalability ladder.
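The spreading idea above can be sketched in a few lines. This is only an illustration (the volume-set names and the round-robin assignment are hypothetical, not BackupAFS's actual scheduler, which is driven by its own configuration):

```python
# Hypothetical sketch: spread full dumps round-robin across the week so
# no single night is loaded with many full dumps. Volume-set names are
# made up; BackupAFS schedules per-volume-set in its own config.
from itertools import cycle

volume_sets = [f"vs{i:02d}" for i in range(1, 11)]  # e.g. 10 of ~100 sets
nights = cycle(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])

schedule = {}
for vs, night in zip(volume_sets, nights):
    schedule.setdefault(night, []).append(vs)

for night in ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]:
    print(night, schedule.get(night, []))

# With 10 sets over 7 nights, no night carries more than 2 full dumps.
```

The point is just that full-dump load grows with sets-per-night, not total sets, so spreading keeps any one night's I/O bounded.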

For example, I have almost 100 volume sets, and no more than 2 of them have 
full dumps performed on any given night. My largest volume set's last full 
dump was 186.3 GB and was dumped at 70.7 MB/s (43.9 mins to back up--all 
volumes were on the same fileserver). That speed is probably approaching 
my RAIDs' spindle speeds, since I'm using mid-range hardware.

Compressing the volume dumps (saving 36.7% of that dump's original space) 
took another 33 mins, and is optional.
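For what it's worth, those figures are self-consistent (taking GB and MB as base-10 units):

```python
# Sanity-check the numbers quoted above.
full_dump_gb = 186.3   # size of the last full dump
rate_mb_s = 70.7       # reported dump throughput
savings = 0.367        # fraction of space saved by compression

minutes = full_dump_gb * 1000 / rate_mb_s / 60
compressed_gb = full_dump_gb * (1 - savings)

print(f"dump time: {minutes:.1f} min")          # ~43.9 min, matching the post
print(f"compressed size: {compressed_gb:.1f} GB")
```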

(And BackupAFS makes it really easy to get statistics like those about your 
backups--it's literally right there on a webpage to look at!)

It's worth noting that the failing/feature Dan mentions regarding hard 
links and pooling applies only to BackupPC. BackupAFS does NOT use hard 
links or pooling, so the data archive can be easily moved from one server 
to another for additional backup (to tape, etc), hardware upgrades, etc.
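To see why a hard-link pool is awkward to relocate, here's a minimal illustration (hypothetical paths; this mimics the pooling idea generically, not BackupPC's actual on-disk layout): many names share one inode, and a naive file-level copy turns each name back into a full independent copy.

```python
# Illustration: file-level dedup via hard links. One pooled inode is
# referenced by several backup trees; copying the tree file-by-file
# would rematerialize every link as a separate full copy.
import os
import tempfile

d = tempfile.mkdtemp()
pool = os.path.join(d, "pool.dat")
with open(pool, "w") as f:
    f.write("x" * 1024)

# Two "backups" link to the same pooled inode instead of duplicating data.
os.link(pool, os.path.join(d, "backup1.dat"))
os.link(pool, os.path.join(d, "backup2.dat"))

print(os.stat(pool).st_nlink)  # 3: three names, one inode
```

Because BackupAFS stores plain volume dump files instead, a straight copy or rsync of the archive preserves everything.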

If anyone has any questions, contact me (privately or via the list, 
depending on the specificity of the question).

Cheers, Stephen
--
If you hold a unix shell to your ear, can you hear the C?

On Mon, 3 Jan 2011, Dan Pritts wrote:

>> Don't forget Stephen Joyce's "BackupAFS". I haven't used it, but I think
>> it's worth mentioning as one of the few backup systems that is actively
>> paying attention to AFS.
>
> We use it (the older version called BackupPC4AFS, we need to upgrade) and 
> it works well for us.
>
> We have a small cell with a single fileserver and < 2TB of data.
>
> I think its scalability will be limited by the I/O bandwidth between your 
> fileservers and your backup server(s).  There's no reason you couldn't 
> have multiple BackupAFS servers; each would just have a different 
> set of volumes to back up.
>
> We also use BackupPC (software on which Stephen based BackupAFS).
>
> BackupPC's major failing and its major feature is that it performs 
> file-level de-duplication by making hard links to a central data pool.
>
> The rest of BackupPC is a reasonable web-GUI shell around this backup 
> archive.  Stephen reused this portion, and I think it ought to scale 
> pretty well.
>
> FWIW, I say it's a feature because it works great.
>
> I say it's a failing because you end up with zillions of hard links in 
> your filesystem.
>
> This means:
>
> There is no good way to duplicate your BackupPC archive.  We dd 
> filesystem images to another set of disks and offsite them.
>
> Restores of a lot of systems all at once will be awfully slow, since your 
> disks have to seek all over the place.
>
> danno
> --
> Dan Pritts, Sr. Systems Engineer
> Internet2
> office: +1-734-352-4953 | mobile: +1-734-834-7224
>
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>