[OpenAFS] more newbie questions

Jonathan Dobbie jonathan_dobbie@mcad.edu
Fri, 23 Feb 2007 12:12:27 -0600

>> We only have one small chunk of data that (I think) lends itself
>> to a RO replica.  We have a network library that is automounted by
>> all OS X computers.  All other data is updated often enough that
>> people wouldn't want to wait for me to release it.  Am I missing a
>> way to set up RO replicas?  It'd be nice if they would mirror changes
>> automatically.  Part of what I want is to be able to have any one
>> piece of hardware die, and either route around it automatically
>> or bring it back up remotely.
>> Here is my current idea (I'm not hugely fond of it, so I'm really
>> hoping that someone has a better one).  There will be two FC
>> storage devices (we currently have one xraid; if we can't get
>> much cash, the second will be another xraid, otherwise something
>> better).  These will be kept in sync with DRBD, at least at the
>> partition level (which seems a little silly).  Heartbeat will be
>> used so that if anything goes wrong with a server or its storage,
>> the other server will restart its AFS server and take over serving
>> the downed server's data automatically.  (It'll certainly end up
>> more complicated than this, but that's the basic idea.)
>> Could someone please point out the holes in this plan?  Is there a  
>> simpler way to do this with R/O replicas that might require me to  
>> manually promote the replica to R/W, but would be less error  
>> prone?  Most of the data involved is home directories and  
>> departmental shares.  If it can be fixed remotely in <5 minutes,  
>> it's probably good enough.
>> I keep thinking that there should be a clever way to use GFS (the
>> Red Hat one, not Google's) instead of DRBD to keep the volumes in
>> sync.  All of the machines have two gigabit NICs, but it still seems
>> like a waste not to use FC.
>> More precisely, would this be possible:
>> /vicepd is on GFS on both RAID arrays (A and B)
>> it's mounted on servers 1(rw) and 2(ro).

Okay, I did some more research and convinced myself that this (the ro
part) is definitely not possible.

>> If A dies, B serves the data and no one notices
>> if 1 dies, heartbeat promotes 2 to rw and ro.
>> and, if it is possible, what would users notice?
> I've read other people's remarks that syncing /vicepx is bad, but I  
> don't know for myself.

I guess it comes down to: what is the best way to have live or nearly  
live failover of user directories?

Am I just being paranoid?  I just have bad memories of my home
directory going dead at 2 AM while I was trying to get some horrid
Verilog code to synthesize.

Should I just automate vos release every minute and then do a vos  
convertROtoRW?  That just feels like a dirty hack, and wouldn't I  
still need to do some magic to get the clients to see the new RW  
server? It'd also make it painful when the server came back up.  What  
would happen to changes that occurred after the last release?
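
To make the "release every minute, then convert" idea concrete, here is a rough sketch of the two commands involved, written as a small Python wrapper. It is only an illustration under assumptions: the volume name user.jdoe and the server afs2.example.edu are hypothetical, /vicepd is taken from the layout described above, and -localauth assumes the script runs as root on a server machine. By default it only prints the commands instead of running them.

```python
import subprocess


def release_cmd(volume, localauth=True):
    """Build the 'vos release' command that pushes the RW contents to the RO sites."""
    cmd = ["vos", "release", "-id", volume]
    if localauth:
        # Assumption: run as root on an AFS server machine, so no token is needed.
        cmd.append("-localauth")
    return cmd


def promote_cmd(server, partition, volume, localauth=True):
    """Build the 'vos convertROtoRW' command that turns the RO copy at
    server/partition into the RW volume."""
    cmd = ["vos", "convertROtoRW",
           "-server", server,
           "-partition", partition,
           "-id", volume]
    if localauth:
        cmd.append("-localauth")
    return cmd


def run(cmd, dry_run=True):
    """Print the command in dry-run mode; otherwise execute it and return its exit code."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    volume = "user.jdoe"
    standby = "afs2.example.edu"

    # Step run periodically (e.g. from cron): refresh the RO copy.
    run(release_cmd(volume))

    # Step run by hand (or by a failover script) once the RW server is dead.
    run(promote_cmd(standby, "/vicepd", volume))
```

The RO copy only holds what was there at the last successful release, so anything written after that point is not on the standby and would not survive the conversion. This sketch also says nothing about what to do when the old RW server comes back up; that part would still need handling by hand.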

I wish that Noora Peura's ideas had made it into OpenAFS; rw replicas
would be really handy.