[OpenAFS] Re: bos backupsys silently failing for one or two volumes?

Andrew Deason adeason@sinenomine.net
Mon, 16 Jul 2012 17:02:43 -0500


On Mon, 16 Jul 2012 15:07:10 -0400
Steve Simmons <scs@umich.edu> wrote:

> Bos config for backupsys:
> 
>   bnode cron snapshot 1
>   parm /usr/sbin/vos backupsys -se localhost -localauth
>   parm 18:00
>   end

This may not be a 'silent' error; you're not recording the output.

> This particular volume came up on June 7. vos examine -fo verified the
> .backup being older than expected. We did a manual vos backup on it
> and it was fine for a few days. It occurred again on the same volume June
> 12, we did a manual backup again. On Jun 19 it hit again. This time we
> deliberately didn't do a manual backup, the next day it reported being
> an additional 24 hours out of date. We forced a backup. On Jun 22 it
> again reported being out of date, we let it go and it came up again on
> the 23rd. At that point we vos moved it from one partition to another
> on the same server, then back to the original partition. The next day,
> it once again had not gotten a .backup, but the next day it did
> (weekend). On July 5 it again had an old .backup. We then moved it to
> another server, and it has not yet thrown problems. We have had
> several other volumes generate similar errors, though none this
> persistent.

When you manually issue a backup, are you running 'vos backup'? Try
running 'vos backupsys' with -prefix set so that it matches only that
one volume, and see if that makes a difference.
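
A rough sketch of that test follows. The volume name user.example is a
placeholder, and the vos path is taken from the bnode quoted above. Note
that a prefix can match more than one volume (anything whose name starts
with the same string), so the sketch does a -dryrun pass first to show
what would be cloned before doing the real run.

```shell
#!/bin/sh
# Sketch only: user.example is a placeholder volume name, not from this thread.
# VOS can be overridden; the default path is the one used in the quoted bnode.
VOS=${VOS:-/usr/sbin/vos}

backupsys_one() {
    volume=$1
    # First list what the prefix would match, without creating any clones.
    "$VOS" backupsys -prefix "$volume" -server localhost -localauth -dryrun || return 1
    # Then do the real run, with -verbose so progress and errors are visible.
    "$VOS" backupsys -prefix "$volume" -server localhost -localauth -verbose
}

# Example (run as root on the fileserver, since -localauth needs the KeyFile):
#   backupsys_one user.example
```

If the single-volume backupsys run fails where a plain 'vos backup' of
the same volume succeeds, that narrows down where to look.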

Alternatively, you can make the cron job save stdout/stderr somewhere,
and examine it after you notice that this has occurred.
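
One way to do that, as a minimal sketch: wrap the command in a small
shell function that appends stdout, stderr and the exit status to a log
file. The function name run_logged and the log path are assumptions for
illustration, not anything bos or vos provides.

```shell
#!/bin/sh
# Sketch only: run_logged and the log path below are hypothetical.

# run_logged LOGFILE COMMAND [ARGS...]
# Appends a timestamped header, the command's stdout and stderr, and its
# exit status to LOGFILE, then returns the command's exit status.
run_logged() {
    log_file=$1
    shift
    status=0
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') starting: $*"
        "$@" || status=$?
        echo "=== exit status: $status"
    } >>"$log_file" 2>&1
    return "$status"
}

# Example, using the command from the quoted bnode plus -verbose:
#   run_logged /usr/afs/logs/backupsys.log \
#       /usr/sbin/vos backupsys -server localhost -localauth -verbose
```

Saved as a script with the example line uncommented, this would replace
the direct vos invocation in the cron bnode's first parm line. When a
volume next shows a stale .backup, the log entry for that night should
show whether vos reported an error for it.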

-- 
Andrew Deason
adeason@sinenomine.net