[OpenAFS] Replication, Fail-over, Disconnected Operation and Caching

Fri, 25 Jan 2008 17:20:02 +0000 (GMT)

Hi,

I've looked through the documentation, but couldn't find any specifics on 
this, so I'd be grateful if somebody could point me at the page I've 
missed.

1) How do OpenAFS clients pick a server to access a volume from if the 
volume is replicated on multiple servers?

2) From the documentation, it looks like the replication mechanism is 
single-master / multiple-slaves, i.e. one read-write server, multiple 
read-only servers. Is that correct? If so, do clients transparently handle 
this? Are writes transparently routed to the read-write server while still 
allowing reads to come from a more local, faster, read-only server?

3) Can the root volume be replicated? What I am really looking to do is 
have 2 servers, one as master and the other with all the volumes 
replicated. Is that possible?

4) If the read-write server fails, how does OpenAFS handle failing over to 
the replicated backup? When the original master comes back up, how 
transparently / gracefully does it resume its role?

5) Is disconnected operation supported via local caching (as per Coda)? If 
so, are there limits on sane cache sizes? Is it reasonable to expect to 
have tens of GB of cached content available on the client nodes?

I am currently using GFS in reliable environments, and Coda on a small 
scale in environments that have to tolerate disconnections, but I have 
concerns about Coda's stability (perpetual betaware, or so it seems) in 
larger and harsher environments (terabytes of storage, hundreds of 
clients, thousands of users), which is why I am looking at OpenAFS as a 
possibly more stable alternative.

Thanks in advance.

Gordan