[OpenAFS] Re: Restoring a RW volume that had replicas

Jeff Blaine jblaine@kickflop.net
Tue, 03 Apr 2012 16:52:17 -0400


On 4/3/2012 4:16 PM, Andrew Deason wrote:
> On Tue, 03 Apr 2012 15:50:53 -0400
> Jeff Blaine <jblaine@kickflop.net> wrote:
>
>> There should have been no myvol (or id 12340) in the VLDB when
>> the 'vos rename' ran, from what I understand.
>
> But you still had replicas on other sites, right? If you have

Yes.

> 'myvol.readonly' vols, then 'myvol' also exists in the vldb. Volumes
> like 'myvol', 'myvol.backup', 'myvol.readonly' etc aren't really
> separate entries. There is one entry in the vldb for 'myvol', and the
> vlserver records the RW, RO, BK, etc volume ids for it.
>
> I think the RW id is always set and you can't get rid of it (even if
> there are no sites where the RW is present), but I'm not sure.

Ah HA.
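
(Noting for the archives: something like the following shows that
single VLDB entry, with the RW, RO, and BK ids all recorded in one
place. 'myvol' here is just our example volume name.)

    vos listvldb -name myvol
    vos examine myvol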

>>> If you want to restore under the original volume name and id number,
>>> 'vos restore' to 'myvol' directly with -name and -id.
>>
>> Let's say I must restore to myvol.R
>
> Well, I don't think we provide any way to change the volume id number,
> and I'm not sure how feasible/advisable doing that would be, since a lot
> of things can go wrong.
>
> But you have some options. You can remove the replicas (you may need a
> 'vos delentry' as well; I'm not sure), then rename the volume, and add
> the replicas back and release. The volume ID number will have changed,
> though, and any clients using that volume will need an 'fs checkv'
> before they can use it again (or wait 2 hours).

This is what I did, and then I dealt with the ensuing "Oh crap,
/usr/rcf/bin/ALL_USER_SHELLS just went away on a bunch of
hosts ..." while hastily feeding an "fs checkvol" into our
bi-hourly config management tool, which runs on all hosts,
and then waiting for it to run.

Ahem. Live and learn.
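
For the archives, the rough sequence I ran looked something like the
following. The server and partition names are placeholders, and I'm
reconstructing this from memory, so treat it as a sketch rather than
an exact transcript:

    vos remove -server rofs1 -partition /vicepa -id myvol.readonly
    vos remove -server rofs2 -partition /vicepa -id myvol.readonly
    vos delentry -id myvol      # only if a stale VLDB entry is left behind
    vos rename -oldname myvol.R -newname myvol
    vos addsite -server rofs1 -partition /vicepa -id myvol
    vos addsite -server rofs2 -partition /vicepa -id myvol
    vos release -id myvol
    fs checkvolumes             # on each client, to pick up the new volume id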

> Or you can 'vos dump myvol.R | vos restore -name myvol -id <theid>'. If
> you're doing this to a server that has a replica, you really want to do
> it on the same partition as the extant RO (we try to prevent you from
> doing otherwise, but I'm not sure if all edge cases are caught; in past
> versions we have missed some). Note that when you release, this should
> cause a full release, since doing a restore can screw up our tracking of
> the incremental data to send, etc.

That would have likely been more pleasant.
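
For completeness, next time I would try something along those lines
instead, with the caveat that the server and partition below are
placeholders and should point at the same partition holding the
existing RO, as you suggest:

    vos dump -id myvol.R -time 0 | \
        vos restore -server rofs1 -partition /vicepa -name myvol -id <theid>

followed by a 'vos release myvol', which as you note should end up
being a full release.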

Thank you for the replies!