[OpenAFS] OpenAFS with RAID

Christof Hanke hanke@rzg.mpg.de
Thu, 29 Dec 2005 12:28:35 +0100

Jeffrey Altman wrote:
> Stephan Wiesand wrote:
>>Wouldn't it be an option to not take over the IP address, but just the
>>vice partition? Once failure of the peer is recognized and confirmed
>>(which is a problem, I agree, but not at all AFS-specific):
>>  1) stonith
>>  2) mount the new vice partition and salvage it
>>     [ 2a) is there a need to restart the fileserver? ]
>>  3) vos syncvldb
> You would also require a
>     4) vos syncserv
>>There must be some thinko in all this, or people would be doing this a
>>lot. What is it I'm overlooking?
> I believe this scenario will not work because the VLDB entries for all
> of the volumes that are being mounted by Server B are listed as being
> on Server A.  Since Server A is unreachable, the volume server when
> performing the "vos syncvldb" and "vos syncserv" steps will not be able
> to verify that the volume is no longer on Server A.  Therefore, there
> is now an "irreconcilable conflict" that will cause the vos command to
> "write a message to the standard error stream."  The vos "command never
> removes volumes from file server machines."  The quotes are from the man
> pages for "vos syncvldb" and "vos syncserv".

I guess you can circumvent this by moving the sysid as well.
The hot swap may then work as follows:
All /vicep* partitions and /usr/afs (Transarc paths assumed) are on
external storage.
If/when fileserver A breaks, you shut it down.
A standby node B, with the same OS as fileserver A (and which has _not_
been a fileserver up to this point), mounts the file systems from the
external storage.
Then you start up the fileserver binary on it.
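
As a rough sketch only, the takeover on node B could look like the script
below. The device names /dev/sdb1 and /dev/sdc1 and the single /vicepa
partition are made up for illustration, and the script assumes that
fileserver A is already powered off. With Transarc paths the sysid file
is /usr/afs/local/sysid, so it moves along with /usr/afs. I start
bosserver here, which in turn starts the fileserver as configured in
BosConfig. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the hot swap on standby node B (Transarc paths assumed).
# Device names are hypothetical; adjust them to the real external storage.
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print the commands
AFS_DEV="${AFS_DEV:-/dev/sdb1}"    # hypothetical device holding /usr/afs
VICE_DEV="${VICE_DEV:-/dev/sdc1}"  # hypothetical device holding /vicepa

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Fileserver A must already be shut down before this is called.
takeover() {
    run mount "$AFS_DEV" /usr/afs    # brings local/sysid and BosConfig along
    run mount "$VICE_DEV" /vicepa    # the vice partition with the volumes
    run /usr/afs/bin/bosserver       # bosserver starts the fileserver
}

takeover
```

Running it as root with DRY_RUN=0 would actually execute the commands.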