[OpenAFS] Some backup advice needed.

scorch scorch@muse.net.nz
Fri, 21 Apr 2006 05:52:19 +0200

Steve Devine said the following on 2006-04-20 17:52:
> All,
> We use the native afs backup. We currently do a full backup of the whole 
> cell weekly followed by daily incrementals until we cycle back to Monday 
> midnight and start the next full.
> As the size of the cell (user vols ) grows our backup window is getting 
> way too big. Often it stretches into late Wed before it completes.
> I am wondering how many are doing weekly fulls? I am considering a 
> monthly full with incrementals the rest of the month but thats kinda 
> scary if maybe your full is not good.
> Opinions?

hi steve,

i hope this helps you. normally this would be 1000euro + a few days of 
onsite consultancy in a fancy suit :-)

if you are able to meet your organisation's SLOs for backup success and data 
retention, and your random restore testing works (you do random restores, 
don't you??), then you don't really have a problem. it just feels uncomfortable.

out of curiosity, how many hosts and how much data do you have? what backup 
h/w do you use?

you have 3 variables you can control that are of interest:

#1 time window
#2 volume
#3 throughput

#1 the time window is easiest to cover: either back up more often, or less often.

less: move to a monthly full for your user volumes, all on 1 day -> 1/4 the 
load, but still the same duration per run.
more: spread the full user-volume backups out across the week -> 1/7 the 
load each day. this has the side-effect of reducing the per-day volume, and 
thus your time window.
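the "spread across the week" idea can be sketched in a few lines. this is a 
hypothetical illustration, not an AFS tool -- it assumes you can pull a volume 
list from `vos listvldb` or similar, and uses a stable hash so each volume 
always lands on the same weekday:

```python
import hashlib

DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def full_backup_day(volume_name: str) -> str:
    """Map a volume to one weekday for its full backup.

    A stable hash keeps each volume on the same day week after week,
    and spreads the load roughly 1/7 per day."""
    h = int(hashlib.md5(volume_name.encode()).hexdigest(), 16)
    return DAYS[h % len(DAYS)]

# example: partition a (made-up) volume list into per-day groups
volumes = [f"user.{i}" for i in range(700)]
schedule = {d: [] for d in DAYS}
for v in volumes:
    schedule[full_backup_day(v)].append(v)

for day in DAYS:
    print(day, len(schedule[day]))
```

the hash keeps the assignment deterministic without maintaining a schedule 
file; volumes added later just slot into whichever day they hash to.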

#2 back up less data. this is the cheapest solution. list the largest 
volumes, or the ones with the oldest files (maybe a bit more effort in AFS, 
but you can build/buy tools to help with this), or those with the smallest 
volume of INCR backups per volume (i.e. the least change), etc etc etc, then 
aggressively target those files/people for migration to offline or near-line 
storage.
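picking those targets is just a sort. a minimal sketch, with made-up numbers 
standing in for what you'd pull out of `vos listvol` and your backup logs:

```python
# rank volumes so the biggest / least-changing ones can be targeted for
# migration to near-line storage. the (volume, size_kb, incr_kb) tuples
# are fake -- in reality they come from `vos listvol` and your INCR logs.
volumes = [
    ("user.alice",  52_000_000,  4_000_000),
    ("user.bob",    48_000_000,     20_000),
    ("proj.legacy", 95_000_000,      1_000),
    ("user.carol",   9_000_000,  3_500_000),
]

def churn_ratio(vol):
    """Fraction of the volume touched by incrementals: low = stale data."""
    name, size_kb, incr_kb = vol
    return incr_kb / size_kb

# stalest-and-largest first: best candidates for the archive.* treatment
candidates = sorted(volumes, key=lambda v: (churn_ratio(v), -v[1]))
for name, size_kb, incr_kb in candidates:
    print(f"{name:<12} {size_kb/1e6:7.1f} GB  churn {churn_ratio((name, size_kb, incr_kb)):.4f}")
```

sorting by (churn, -size) puts the big, untouched volumes at the top of the 
hall-of-shame list.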

it helps to send a few pointed emails, remind people that storage doesn't 
come cheap, & ideally provide the organisation's managers with a "cost" for 
keeping this stale data online. either move it to DVD/CD/tape or give them a 
"holding" area that gets backed up only monthly (or whatever makes sense) 
instead of weekly, e.g. rename the volumes to "archive.whatever". for a uni, 
a hall of shame by department, conspicuously posted by the chancellor's 
office, is a start.

#3 throughput
you have a few options here, depending on cash. the goal of backup is to put 
the bottleneck at the tape drive -- and still be capable of meeting your SLO 
for restore. so where is your bottleneck? good drives today can handle 
300GB/hour native, and i've had them up to 500 with a strong tail wind. but 
if your data is flowing from a workstation over 10Mb ethernet, the drive will 
never achieve its stated throughput -- not to mention restore problems. you 
do do random restore testing, don't you?

how fast can you dump data to /dev/null? this is your upper limit. increase 
it by trial & error: run multiple streams (volumes) at the same time, spread 
the load out over multiple hosts, or get a bigger box or a bigger disk array.
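a quick-and-dirty version of the /dev/null test, as a sketch. note the 
caveats in the docstring: pushing zeroes from memory measures only the 
host/CPU side, so treat the number as a ceiling -- a real test would be 
something like `vos dump ... > /dev/null` against actual volumes:

```python
import os
import time

def null_throughput_gb_per_hour(total_mb: int = 256, chunk_mb: int = 4) -> float:
    """Write zero-filled chunks to /dev/null and report the rate.

    This only measures the host/CPU side of the pipe -- the real ceiling
    is the slowest of disk read, LAN, and tape. But anything the host
    can't push to /dev/null, it certainly can't push to a drive."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    start = time.monotonic()
    with open(os.devnull, "wb") as sink:
        for _ in range(total_mb // chunk_mb):
            sink.write(chunk)
    elapsed = max(time.monotonic() - start, 1e-9)
    gb = total_mb / 1024
    return gb / (elapsed / 3600)

print(f"{null_throughput_gb_per_hour():.0f} GB/hour to /dev/null")
```

run the same loop in parallel from several hosts and you have the 
multiple-streams trial & error mentioned above.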

HP's website has a few free tools to test this, plus some pretty sound advice:

use a separate backup LAN
spread backup servers across subnets to reduce cross-router traffic
use multiple backup servers in each subnet
use trunking (FEC, APA, etc) to backup servers
connect them to CORE not EDGE switches

ethernet expected throughput: ~60% of rated wire speed
LAN backups use less CPU than SCSI/FC
use 1 Gb NIC / subnet

100 Mb/s (single NIC)            = 25 GB/hour
200 Mb/s (dual NIC with FEC/APA) = 40 GB/hour
1 Gb/s   (single NIC)            = 40 GB/hour
2 Gb/s   (dual NIC with FEC/APA) = 50 GB/hour
10 Gb/s  (single NIC)            = 200 GB/hour

note that 3 NICs trunked on many platforms (solaris, hpux, windows) seem to 
give less throughput than 2 of the same type. go figure.

100FDX = 100 Mbit/s maximum
       = 100/8 MB/s
       = 100/(8*1024) GB/s
       = (100 * 3600) / (8 * 1024) GB/hour
       = 0.6 * (100 * 3600) / (8 * 1024) GB/hour at expected ethernet capacity
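that arithmetic is worth wrapping in a one-liner so you can run it for any 
link speed (the ~60% derating factor is the rule of thumb above, not a law):

```python
def expected_gb_per_hour(wire_mbit: float, efficiency: float = 0.6) -> float:
    """Convert a rated ethernet speed into a realistic backup rate.

    bits -> bytes (/8), MB -> GB (/1024), seconds -> hours (*3600),
    derated to ~60% of wire speed as typically seen in practice."""
    return efficiency * wire_mbit * 3600 / (8 * 1024)

print(f"100FDX ~= {expected_gb_per_hour(100):.1f} GB/hour")
print(f"1 Gb   ~= {expected_gb_per_hour(1000):.1f} GB/hour")
```

note the theoretical 1 Gb figure comes out well above the ~40 GB/hour 
measured in the table above -- at gigabit speeds the bottleneck has usually 
moved off the wire and onto the host or the drive.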

drive	SCSI	SAN	(GB/hour, measured)
lto1	80	70
lto2	120	100
sdlt2	100	80
sdlt1	70	55
dlt80	25	20

NB the attached primary storage makes a big difference -- above speeds were
measured in production, using U3 SCSI RAID arrays; commercial arrays 
(EMC,HDS,XP) can provide much higher throughput (on quality server kit), and 
also support multiple concurrent streams. block size can make a big 
difference depending on the application.

my experience is that SAN tends to be a bit slower, depending on the type of 
data being backed up. fiddling with block sizes can bring it back to within 
about 5% of native SCSI speed.

NB the drives, HBAs & PCI buses themselves all have limitations:
max 2 drives / FC HBA
1 drive / SCSI U3 HBA
4 drives / PCI bus
1 LTO1 drive uses  400MHz CPU & 64MB RAM
1 LTO2 drive uses  800MHz CPU & 64MB RAM
1 LTO3 drive uses 1000MHz CPU & 64MB RAM
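those rules of thumb compose into a back-of-envelope host-sizing check. a 
sketch only -- the per-drive figures and HBA/bus limits are the ones listed 
above, and your mileage will vary by platform:

```python
import math

# (CPU MHz, RAM MB) needed per drive -- rules of thumb from above
DRIVE_NEEDS = {
    "lto1": (400, 64),
    "lto2": (800, 64),
    "lto3": (1000, 64),
}

def host_sizing(drive: str, n_drives: int, bus_type: str = "fc") -> dict:
    """Estimate host resources for n tape drives of one type."""
    cpu, ram = DRIVE_NEEDS[drive]
    per_hba = 2 if bus_type == "fc" else 1   # max 2 drives/FC HBA, 1/SCSI U3
    return {
        "cpu_mhz":   cpu * n_drives,
        "ram_mb":    ram * n_drives,
        "hbas":      math.ceil(n_drives / per_hba),
        "pci_buses": math.ceil(n_drives / 4),  # max 4 drives / PCI bus
    }

print(host_sizing("lto2", 6))
```

six LTO2 drives already want ~4.8GHz of CPU, 3 FC HBAs and 2 PCI buses -- 
which is why "get a bigger box" shows up as a real option above.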

commercial backup software kicks arse. don't expect old tools like tar, 
cpio, etc ever to achieve the same throughput. if you can, demo a few. the 
big ones, in no order: legato, symantec netbackup, hp dataprotector, ibm 
tivoli -- all with wide platform support.

consider getting some secondary disks: spool your backups over the backup 
LAN onto those, & then back up to tape directly 1x per week to a small 
autoloader with a very fast drive. you can put this mezzanine host in a 
separate site/building and still meet DR requirements. if you already have a 
SAN + tape drives, and have more cash, look into the new virtual tape 
libraries offered by the usual suspects.

i've not used amanda or bacula, but either of these open-source products 
might be enough for you.

find the bottleneck. eliminate it. repeat. start at the host end: what's 
your max throughput to /dev/null? use ftp between the 2 hosts -- how fast 
does it go? how is your LAN structured, if you're doing LAN backups? test 
your tape drive using something other than tar.

& you do random restores don't you? :-)

cheers, scorch
out of the frying pan and into the fire