[OpenAFS] byte-level incremental backups using "full+reverse delta"

Marcel Koopmans marcel.koopmans@dsv.com
Fri, 8 Feb 2008 07:52:08 +0100


Well, since we are giving away replication scripts...
This script works *without* configuration, and maybe people can learn from
it.
Just run it on your volservers, and don't forget to get your AFS token first.

repl_afs.pl

------

#!/usr/bin/perl -w

#
# OpenAFS volume replication
#   by : Marcel D.A. Koopmans of Elysium Open Systems
#
#   version : 1.0.5 ( 2007-08-24 20:25 GMT+1 )

use strict;
use Sys::Syslog;

sub write_log {
   my ( ${level}, ${message} ) = @_;

   openlog( "afs_repl", "", "DAEMON" );
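   # pass the message as an argument, not as the format string,
   # so a literal "%" in it cannot be misread as a format directive
   syslog( ${level}, "%s", ${message} );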
   closelog;
}

sub main {
  my ${err}=0;
  my ( $a, $b );
  my @temp;
  my ( @ip_a, %ip_h );
  my ( %hn_h );
  my ( $volume, $sites, $host );
  my ( ${exitcode} );

  # log start

  write_log( "INFO", "OpenAFS replication start" );

  # get local IP addresses

  open ( FILE, "/sbin/ifconfig -a |" );
  while ( <FILE> ) {
    chomp($_);
    if ( $_=~ m/^\s+inet\s+addr:([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)\s+.*$/ ) {
      if ( ! exists ( ${ip_h{$1}} ) ) {
        ${ip_h{$1}}=1;
        ${ip_a[$#{ip_a}+1]}=$1;
      }
    }
  }
  close ( FILE );

  # sort IP addresses;

  @temp=@ip_a;
  @ip_a=sort(@temp);
  $#temp=-1;

  # resolve hostnames

  if ( $#{ip_a} > -1 ) {
    for ( $a=0; $a<=$#{ip_a}; $a++ ) {
      open ( FILE, "/usr/bin/getent hosts " . ${ip_a[$a]} . " |" );
      while ( <FILE> ) {
        chomp($_);
        @temp=split(/\s+/,$_);
        for ( $b=0; $b<=$#temp; $b++ ) {
          if ( ! exists ( ${hn_h{$temp[$b]}} ) ) {
            ${hn_h{$temp[$b]}}=1;
          }
        }
      }
      close ( FILE );
    }
  }

  # Get openAFS volumes

  $a=-1;
  open ( FILE, "/usr/bin/vos listvldb 2> /dev/null |" );
  while ( <FILE> ) {
    chomp($_);
    $_=~ s/\s+$//;

    if ( ( $a == 2 ) && ( $_=~ m/^\s+server\s+(.+)\s+partition\s+\/vicep[a-z]+\s+RW\s+Site$/ ) ) {
      ${host}=$1;
      if ( exists ( $hn_h{$host} ) ) {
        if ( ${sites} > 1 ) {
          ${exitcode}=system("/usr/bin/vos release " . ${volume} . " > /dev/null 2>&1" );
          ${exitcode}=${exitcode} / 256;

          if ( ${exitcode} == 0 ) {
            write_log( "INFO", "release of volume " . ${volume} . " success" );
          } else {
            write_log( "ERR", "release of volume " . ${volume} . " failed, returned exit code " . ${exitcode} );
          }
        }
      }
    }

    if ( ( $a == 1 ) && ( $_=~ m/^\s+number\s+of\s+sites\s+\-\>\s+([0-9]+)$/ ) ) {
      $a=2;    # sites
      ${sites}=$1;
    }

    if ( ( $a == 0 ) && ( $_=~ m/^(.+)$/ ) ) {
      $a=1;    # volume name
      ${volume}=$1;
    }

    if ( $_=~ m/^$/ ) {
      $a=0;    # new block
    }

  }
  close ( FILE );

  # log end

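  write_log( "INFO", "OpenAFS replication end" );

  # return an explicit status; otherwise exit() below would receive
  # whatever write_log happened to return
  return ${err};
}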

exit ( main );

------
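
For readers who would rather follow the parsing logic in shell, here is a
minimal sketch of the same state machine. It is not part of the script above.
The function name releasable_volumes is made up for illustration, and it
compares against a single host name passed as an argument rather than every
name returned by getent. The input shape it expects is inferred from the
regular expressions in the Perl code.

```shell
#!/bin/bash
# Print the volumes the Perl script would release: VLDB entries with more
# than one site whose RW site is the given host name.
# Reads `vos listvldb`-style text on stdin.
releasable_volumes() {
  local host=$1 line state=-1 volume="" sites=0
  local re_sites='^[[:space:]]+number[[:space:]]+of[[:space:]]+sites[[:space:]]+->[[:space:]]+([0-9]+)$'
  local re_rw='^[[:space:]]+server[[:space:]]+([^[:space:]]+)[[:space:]]+partition[[:space:]]+/vicep[a-z]+[[:space:]]+RW[[:space:]]+Site$'

  while IFS= read -r line; do
    # strip trailing whitespace, as the Perl code does
    line=${line%"${line##*[![:space:]]}"}

    if [[ -z $line ]]; then
      state=0                      # blank line: a new VLDB entry follows
    elif (( state == 0 )); then
      volume=$line; state=1        # first line of an entry is the volume name
    elif (( state == 1 )) && [[ $line =~ $re_sites ]]; then
      sites=${BASH_REMATCH[1]}; state=2
    elif (( state == 2 )) && [[ $line =~ $re_rw ]]; then
      # only replicated volumes whose RW copy lives on this host
      if [[ ${BASH_REMATCH[1]} == "$host" ]] && (( sites > 1 )); then
        printf '%s\n' "$volume"
      fi
    fi
  done
}
```

Something like `vos listvldb | releasable_volumes "$(hostname -f)"` would then
list the candidates without releasing anything, which may be handy for a dry
run before enabling the real script.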

With kind regards,
  Marcel





-----Original Message-----
From: openafs-info-admin@openafs.org
[mailto:openafs-info-admin@openafs.org]On Behalf Of Adam Megacz
Sent: 07 February 2008 19:22
To: openafs-info@openafs.org
Cc: hcoop-sysadmin@hcoop.net
Subject: [OpenAFS] byte-level incremental backups using "full+reverse
delta"



In case anybody else finds this useful, I've worked out a system for
doing backups of an AFS volume-set with the "full and reverse delta"
style of incremental backups.

  /afs/megacz.com/srv/bin/dump.sh

Much like rdiff-backup, this keeps a complete copy of the most recent
backup, and stores all previous backups as deltas against the
*following* backup (i.e. in the "reverse" direction).  This means you
can "truncate" the backup history whenever you like, rather than only
at full-backup intervals, and there's never any reason to keep more
than a single "full" backup around.
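
To make the "reverse" direction concrete, here is a rough dry-run sketch of
how an older dump could be rebuilt by walking backwards from the full copy.
It only prints commands and touches nothing. The function name restore_plan
and the restore/ scratch directory are made up for illustration; the file
layout and the xdelta3 flags follow the script further down, and the sketch
assumes the listed days are exactly the days on which a backup ran.

```shell
#!/bin/bash
# Dry run: print the xdelta3 commands that would rebuild an older dump.
# Usage: restore_plan CELL VOL TARGET_DAY DAY...   (DAY list oldest first)
restore_plan() {
  local cell=$1 vol=$2 target=$3
  shift 3
  local days=("$@") d found=0

  for d in "${days[@]}"; do
    [[ $d == "$target" ]] && found=1
  done
  if (( ! found )); then
    echo "no backup listed for $target" >&2
    return 1
  fi

  local i=$(( ${#days[@]} - 1 ))
  local newer="${days[i]}/$cell/$vol.afsdump"   # newest day holds the full dump
  local out

  # each reverse delta turns the dump of day N+1 into the dump of day N
  while [[ ${days[i]} != "$target" ]]; do
    i=$(( i - 1 ))
    out="restore/${days[i]//\//-}.afsdump"      # hypothetical scratch location
    echo "xdelta3 -dcfs $newer ${days[i]}/$cell/$vol.afsdump.rev-xdelta > $out"
    newer=$out
  done
  echo "result: $newer"
}
```

For example, `restore_plan megacz.com pub 2008/02/05 2008/02/05 2008/02/06
2008/02/07` prints two xdelta3 commands, one per day stepped back. Nothing in
the chain ever depends on a day older than the target, which is why the oldest
deltas can be dropped at any time.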

Deltas are done with xdelta3.  Although it is very CPU-intensive, it
seems to be very good at representing minor changes (i.e. a few bytes)
to very large files, which "incremental dumps" cannot do, and which
(in my experience) rdiff did not do as well as I would like.  A "test
restore" of each delta is done before deleting the old dump, so you
don't need to trust that the xdelta3 algorithm is correct -- you just
have to trust that it is deterministic.
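
The "test restore" idea can be written as a small generic helper. This is
only a sketch: the name verify_then_remove is made up, and it takes the
restore command as arguments so that it does not depend on xdelta3 itself.

```shell
#!/bin/bash
# Delete OLD only if the given restore command reproduces it byte for byte.
# Usage: verify_then_remove OLD RESTORE_COMMAND [ARGS...]
verify_then_remove() {
  local old=$1
  shift
  # Run the restore command and compare its stdout with $old.
  # pipefail (scoped to the subshell) makes a failing restore count as a mismatch.
  if ( set -o pipefail; "$@" | cmp -s - "$old" ); then
    rm -- "$old"
  else
    echo "restore did not reproduce $old; keeping it" >&2
    return 1
  fi
}
```

With the names used in the script below, the call would look something like
`verify_then_remove "$OLD" xdelta3 -dcfs "$NEW" "$OLD.rev-xdelta"`. The old
dump survives whenever the comparison fails, whatever the reason.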

Hope you find this useful,

  - a


#!/bin/bash -e

# A script for "full and backward diff" style incremental backups of
# one or more cells' AFS dumpfiles using xdelta3 for diffing.  Note
# that xdelta3 will efficiently represent minor changes to very large
# files, which AFS "incremental dumps" cannot do.

#
# IMPORTANT: you must use xdelta3 version SVN.227 or later -- this
# will become release 3.0u at some point.  There is a precompiled x86
# deb at /afs/megacz.com/debian/xdelta3/xdelta3_svn227.deb
#

# Change these variables to suit your needs.  Note:
# - Backups are kept in $BACKUPDIR/year/month/day/cell/vol.afsdump.
# - At all times the latest backup is kept in "full" form, and all
#   previous backups are kept as "reverse diffs" against the backup
#   from the day AFTER them.  This lets you easily "truncate" the
#   backup history at any time.
# - The symlink $BACKUPDIR/yesterday points to yesterday's backups, if
#   there were any; upon completion of today's backups, yesterday's
#   will be converted into reverse diffs.

BACKUPDIR=/vol/dumps
LOGFILE=/var/log/dump.log

# send both stdout and stderr to the log file
exec > "$LOGFILE" 2>&1

mkdir -p $BACKUPDIR
cd $BACKUPDIR
DIR=`date +%Y/%m/%d`
mkdir -p $DIR
rm -f today
ln -fs $DIR today

dump() {
  CELL=$1
  VOL=$2

  OLD=yesterday/$CELL/$VOL.afsdump
  NEW=today/$CELL/$VOL.afsdump

  aklog -c $CELL
  mkdir -p today/$CELL

  echo "=============================================================================="
  echo "dumping $VOL in cell $CELL..."
  echo
  tokens
  vos examine $VOL -cell $CELL
  echo

  if test -e $OLD
  then \
     vos dump $VOL -file $NEW -clone -cell $CELL
     xdelta3 -evfs $NEW $OLD $OLD.rev-xdelta || true
     ( (xdelta3 -dcvfs $NEW $OLD.rev-xdelta | cmp - $OLD) && (echo 'removing...'; rm $OLD) ) || true
  else \
    vos dump $VOL -file $NEW -clone -cell $CELL
  fi

  echo "  done dumping $VOL."
}

# each command below is in the form "dump <cellname> <volumename>"
dump megacz.com               pub
dump megacz.com               srv
dump megacz.com               web
dump megacz.com               mail
dump megacz.com               work
dump megacz.com               user
dump megacz.com               user.megacz
dump megacz.com               root.cell
dump research.cs.berkeley.edu root.cell

# update the "yesterday" symlink so tomorrow's backups will deltify today's
rm -f yesterday
mv today yesterday


##############################################################################
# old rdiff-based code; please ignore
##############################################################################
#
# rdiff delta     $OLD.sig $NEW      $OLD.fwd
# rdiff patch     $OLD     $OLD.fwd  $NEW
# rm $OLD.fwd
#
#    test -e $OLD.sig || rdiff signature $OLD     $OLD.sig
#    vos dump $VOL -clone -cell $CELL \
#        | tee $NEW  \
#        | (rdiff -- signature - - \
#           | tee $NEW.sig \
#           | rdiff -- delta - $OLD - \
#           | (tee $OLD.rev \
#              | rdiff -- patch $NEW - - \
#              | (cmp $OLD - && (echo "removing old dump..."; rm -f $OLD))))
#    rm -f $OLD.sig
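
As a companion to the note above about truncating the history, here is a
small sketch that lists the day directories older than a cutoff. The function
name days_older_than is made up; it assumes the $BACKUPDIR/year/month/day
layout described in the script's comments and only prints paths, leaving the
deletion to you.

```shell
#!/bin/bash
# List day directories under BACKUPDIR that are older than CUTOFF (YYYY/MM/DD).
# Usage: days_older_than BACKUPDIR CUTOFF
days_older_than() {
  local backupdir=$1 cutoff=$2 d day
  # the glob only matches year/month/day directories, so the "today" and
  # "yesterday" symlinks at the top level are never listed
  for d in "$backupdir"/[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]; do
    [[ -d $d ]] || continue
    day=${d#"$backupdir"/}
    # zero-padded dates compare correctly as plain strings
    if [[ $day < $cutoff ]]; then
      printf '%s\n' "$d"
    fi
  done
}
```

Since every reverse delta depends only on newer backups, removing the listed
directories should leave all remaining days restorable.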
