[OpenAFS] AFS suggestions

Horst Birthelmer horst@riback.net
Fri, 29 Jul 2005 09:55:22 +0200

On Jul 29, 2005, at 12:14 AM, lamont@scriptkiddie.org wrote:
> On Thu, 28 Jul 2005, Horst Birthelmer wrote:
>> Noora Peura did her masters thesis on that topic. (It was actually  
>> for arla, but I guess it could help anyway)
>> Maybe your student already knows her work ... ;-)
>> The report is at:
>> http://www.stacken.kth.se/~noora/exjobb/files.html
> interesting.  what i'd like to see is just read-one/write-all used  
> in conjunction with the existing ro-clones.  the problem with
> ro-clones is that you can't be constantly releasing the rw volume to
> the ro volumes on every write.  if you batch up writes you expose  
> yourself to losing data which has been written due to failure of a  
> single fileserver.  to provide acceptable (for my applications)  
> data integrity, just write-all to 2 or 3 fileservers and on failure  
> of any one of them, use one which didn't have the write fail and  
> replicate it to the ro-clones.  then create a new r/w bucket and  
> keep going.

That won't work. You talked about integrity. That's exactly the key  
word here.

What should happen if one of the fileservers comes back with an old  
version of every file in the volume?
What should happen if a client wrote to a fileserver and that one
fails? (when we still have a connection to that server and no
replication...) How would your replicated volumes react to that?
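
To make the first scenario concrete, here is a toy model in Python. It is
only a sketch: the Replica class and the write_all/read_one helpers are
made up for illustration and have nothing to do with the real AFS
fileserver code or protocol. It assumes plain write-all/read-one with no
version numbers and no recovery step.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one fileserver holding a copy of the volume."""
    name: str
    up: bool = True
    files: dict = field(default_factory=dict)


def write_all(replicas, path, data):
    """Write to every reachable replica; return the names that missed the write."""
    missed = []
    for r in replicas:
        if r.up:
            r.files[path] = data
        else:
            missed.append(r.name)
    return missed


def read_one(replicas, path):
    """Read from the first reachable replica, with no version check at all."""
    for r in replicas:
        if r.up:
            return r.files.get(path)
    raise RuntimeError("no replica reachable")


def demo():
    a, b = Replica("fs-a"), Replica("fs-b")
    replicas = [a, b]
    write_all(replicas, "log", "v1")           # both replicas hold v1
    a.up = False                               # fs-a fails
    missed = write_all(replicas, "log", "v2")  # only fs-b gets v2
    a.up = True                                # fs-a returns with its old copy
    return missed, read_one(replicas, "log")
```

In this sketch demo() hands back the old contents even though the second
write succeeded on fs-b, because nothing tells read_one which copy is
current.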

Those are just a few very simple scenarios that came to my mind  
without really thinking much about it.

It's a lot _more_ complicated than that, and I'm not even talking about
the changes to the protocol and the servers, which would no longer be
compatible, so we couldn't mix servers and clients from different versions.

> for an append-only service writing into volumes the ro-clones are  
> the right model to store the data long-term -- but while the volume  
> is r/w it needs to still be more available than a single fileserver.

The guys who did the design of AFS were neither stupid nor naive.
They knew what hell it would be to do replication any other way than
it was done ;-)