[OpenAFS-devel] Re: [OpenAFS] Re: 1.6 and post-1.6 OpenAFS branch management and schedule

Tom Keiser tkeiser@sinenomine.net
Fri, 18 Jun 2010 11:17:52 -0400


On Fri, Jun 18, 2010 at 3:55 AM, Jeffrey Hutzelman <jhutz@cmu.edu> wrote:
> --On Thursday, June 17, 2010 04:12:48 PM -0500 Andrew Deason
> <adeason@sinenomine.net> wrote:
>
>> On Thu, 17 Jun 2010 15:54:25 -0500
>> Andrew Deason <adeason@sinenomine.net> wrote:
>>
>>> And as has been mentioned elsewhere in the thread, you need to wait for
>>> the VG hierarchy summary scan to complete, no matter how fast salvaging
>>> is or how many you do in parallel. That involves reading the headers of
>>> all volumes on the partition, so it's not fast (but it is very fast if
>>> you're comparing it to the recovery time of a 1.4 unclean shutdown)
>>
>> Also, while I keep talking about this, what I haven't mentioned is that
>> it may be solvable. Although I've never seen any code or even a
>> complete plan for it yet, recording the VG hierarchy information on disk
>> would obviate the need for this scan. Doing this would allow you to
>> salvage essentially instantly in most cases, so you might be able to
>> recover from an unclean shutdown and salvage 100s of volumes in a few
>> seconds.
>
> It's also worth noting that in a namei fileserver, each VG is actually
> wholly self-contained, so there is no reason in the world why you should
> have to scan every VG on the partition before you can start salvaging any of
> them.  The salvage server design really should take this property into
> account, as it seems likely that some future backends may also have this
> property.
>

We _do_ treat each VG as a separate, concurrently-processed entity.
The problem is the on-disk format's VG membership data leaves much to
be desired--all we have to work with is the parent's volume id in
VolumeHeader_t (in other words, a forest of up-trees).  Hence, given
any arbitrary volume id, you end up performing an exhaustive search to
determine the full membership set of a VG.  This is why we wrote the
VGC in the first place: so you only have to perform that exhaustive
search once.
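
To make the up-tree point concrete, here is a rough sketch in C, not
the actual OpenAFS code.  The struct vol_hdr type and both function
names are hypothetical stand-ins; the only thing taken from the real
format is that each header records its parent's volume id.  The sketch
also assumes the parent volume lists itself as its own parent, and that
0 is never a valid volume id.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the on-disk volume header; only the two
 * fields this sketch needs. */
struct vol_hdr {
    uint32_t id;      /* this volume's id */
    uint32_t parent;  /* parent volume id (assumed == id for the parent) */
};

/* Find the parent id recorded for `id`.  Returns 0 if no header on the
 * partition carries that id (0 is assumed never to be a valid id). */
uint32_t vg_parent_of(const struct vol_hdr *hdrs, size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (hdrs[i].id == id)
            return hdrs[i].parent;
    }
    return 0;
}

/* Collect the members of the VG that contains `id`.  Headers only point
 * upward, so nothing tells us where the siblings are: every header has
 * to be examined.  Stores at most `cap` ids in `out` and returns the
 * total number of members found. */
size_t vg_members(const struct vol_hdr *hdrs, size_t n, uint32_t id,
                  uint32_t *out, size_t cap)
{
    uint32_t parent = vg_parent_of(hdrs, n, id);
    size_t count = 0;

    if (parent == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (hdrs[i].parent == parent) {
            if (count < cap)
                out[count] = hdrs[i].id;
            count++;
        }
    }
    return count;
}
```

Every call to vg_members() walks the whole header array, which here
stands in for reading every volume header on the partition.  A cache
like the VGC pays that cost once and answers later membership queries
from memory.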

-Tom