[OpenAFS] Security Advisory 2016-003 and 'bos salvage' questions
Garance A Drosehn
drosih@rpi.edu
Wed, 15 Feb 2017 13:48:52 -0500
I had an odd situation pop up when upgrading to OpenAFS 1.6.20.1.
The description of the security advisory at
http://www.openafs.org/pages/security/OPENAFS-SA-2016-003.txt
says:
> We further recommend that administrators salvage all volumes with
> the -salvagedirs option, in order to remove existing leaks.
I'm upgrading our file servers from OpenAFS 1.6.14 to 1.6.20.1 (and
also upgrading our RHEL 7.3 kernel from version 3.10.0-327.el7 to
version 3.10.0-514.6.1.el7 on those machines).
We had some other changes going on as part of this, and I wanted to
minimize how much disruption might occur if something went wrong
with any of the changes. So I:
1) created a small-capacity file server with the new OpenAFS
and newer kernel.
2) 'vos move'-ed some of the busier non-replicated volumes from
an existing file server to the new file server.
3) upgraded that existing file server.
4) moved volumes back.
... wait a few days, and repeat for a different file server.
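
The move steps (2 and 4) can be sketched as a small dry-run helper that only prints the 'vos move' commands it would issue. This is an illustration, not the script actually used here: the function name plan_moves and every server, partition, and volume name are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the 'vos move' commands for a list of volumes.
# Nothing is executed. All server, partition, and volume names below are
# hypothetical placeholders, not the real ones from this site.

# Usage: plan_moves FROM_SERVER FROM_PART TO_SERVER TO_PART VOLUME...
plan_moves() {
    from_server=$1; from_part=$2; to_server=$3; to_part=$4
    shift 4
    for vol in "$@"; do
        echo "vos move -id $vol -fromserver $from_server -frompartition $from_part -toserver $to_server -topartition $to_part"
    done
}

# Step 2: move the busier volumes to the temporary file server.
plan_moves fs1.example.edu /vicepa fstemp.example.edu /vicepa user.alice user.bob

# Step 4: after fs1 has been upgraded, move them back.
plan_moves fstemp.example.edu /vicepa fs1.example.edu /vicepa user.alice user.bob
```

Printing the commands first makes it possible to review the plan before running anything against a live server.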
There's no step there where I did a 'bos salvage -salvagedirs'. I
had forgotten about that advice in the advisory. I did all these
steps for a few file servers with no problems at all.
On my most recent server, the script that performs step #4 moved 26
volumes, and then hit this error on the 27th one:
> Failed to move data for the volume 53.....75
> VOLSER: Problems encountered in doing the dump !
> Recovery:Failed to start transaction on 53.....75
> Volume needs to be salvaged
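
A move script could stop as soon as this message shows up instead of continuing with the remaining volumes. The sketch below is one way to do that, assuming the script captures the combined stdout and stderr of 'vos move'; the helper name needs_salvage is hypothetical, and the sample text is the output quoted above with the volume ID elided as in the original.

```shell
#!/bin/sh
# Sketch: detect the salvage message in captured 'vos move' output.
# needs_salvage is a hypothetical helper name, not part of OpenAFS.

# Reads captured output on stdin; exit status 0 if the message is present.
needs_salvage() {
    grep -q 'Volume needs to be salvaged'
}

# Sample input: the error quoted above (volume ID elided as in the original).
sample='Failed to move data for the volume 53.....75
VOLSER: Problems encountered in doing the dump !
Recovery:Failed to start transaction on 53.....75
Volume needs to be salvaged'

if printf '%s\n' "$sample" | needs_salvage; then
    echo "stop: salvage required before continuing"
fi
```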
Is this simply that 'vos move' in version 1.6.20.1 is doing more
consistency checks than version 1.6.14 did? The last-update time
for that volume is in Oct 2008, so it's not like it has been
changed recently. And the new temp-storage fileserver hasn't even
been restarted since I started this upgrade. The volume is
still on-line for AFS users, and at the user level it seems to
be in fine shape. From what little I can tell, I can access all
47,000 files without any errors.
Is it reasonable for me to just do the 'bos salvage' for this
specific volume, then do the 'bos salvage -salvagedirs' for the
entire partition on the temp-space fileserver, and then do that
on the existing (newly-upgraded) fileserver, and then restart
my vos-moves?
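
For concreteness, the sequence being proposed could look roughly like the dry-run sketch below, which only prints the commands. The hostnames and partition are hypothetical placeholders, the volume ID must be supplied by the operator (it is elided in this message), and the sketch assumes the failed volume still lives on the temp-space server.

```shell
#!/bin/sh
# Dry-run sketch of the proposed salvage sequence; it only prints commands.
# Hostnames and partition are hypothetical placeholders. VOLUME_ID must be
# supplied by the operator; the real ID is elided in this message.
TEMP_SERVER=fstemp.example.edu
UPGRADED_SERVER=fs1.example.edu
PARTITION=/vicepa
VOLUME_ID=${VOLUME_ID:-VOLUME_ID_HERE}

salvage_plan() {
    # 1. Salvage only the volume that failed to move
    #    (assumed to still be on the temp-space server).
    echo "bos salvage -server $TEMP_SERVER -partition $PARTITION -volume $VOLUME_ID"
    # 2. Salvage the whole partition on the temp-space server with -salvagedirs.
    echo "bos salvage -server $TEMP_SERVER -partition $PARTITION -salvagedirs"
    # 3. Do the same on the newly upgraded server.
    echo "bos salvage -server $UPGRADED_SERVER -partition $PARTITION -salvagedirs"
}

salvage_plan
```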
Also, can I do the 'bos salvage -salvagedirs' while the 'fs'
process is running, or do I need to stop and restart 'fs'
around that salvage command?
I ran 'bos salvage -help', and see that a number of interesting
options are available which do not appear in the man page or the
documentation. I'm inclined to go with '-orphans ignore' for
a first run, and then see what is listed for orphans. I'm also
curious about the '-nowrite' option. Will that do a thorough
check of what 'salvage' would need to do, but without making
any modifications to the volume?
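
The two candidate invocations would look something like the dry-run sketch below, which prints the commands without running them. The server and partition are hypothetical placeholders. The '-nowrite' flag is taken from the '-help' output mentioned above; the sketch makes no claim about what it actually does, since that is the open question.

```shell
#!/bin/sh
# Dry-run sketch: print candidate first-pass commands; nothing is executed.
# SERVER and PARTITION are hypothetical placeholders.
SERVER=fstemp.example.edu
PARTITION=/vicepa

# First run: leave orphans alone and show the salvage log for review.
first_pass() {
    echo "bos salvage -server $SERVER -partition $PARTITION -salvagedirs -orphans ignore -showlog"
}

# Candidate check-only run. -nowrite is listed by 'bos salvage -help';
# its exact behavior is the question being asked, not something shown here.
nowrite_pass() {
    echo "bos salvage -server $SERVER -partition $PARTITION -nowrite -showlog"
}

first_pass
nowrite_pass
```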
--
Garance Alistair Drosehn = drosih@rpi.edu
Senior Systems Programmer or gad@FreeBSD.org
Rensselaer Polytechnic Institute; Troy, NY; USA