[OpenAFS] Resilience

Tue, 02 Jun 2009 05:33:45 -0700

Wheeler, JF (Jonathan) wrote:
> One of our (3) AFS servers has a mounted read-write volume which must be
> available 24x7 to our batch system.  The server is as resilient as we
> can make it, but still it may fail outside normal working hours for some
> reason.  For technical reasons related to the software installed on the
> volume it is not possible to use read-only volumes mounted from our
> other servers (the software must be installed and served from the same
> directory name), so I have devised the following plan in the event of a
> failure: 
> 
> a) create read-only volumes on the other 2 servers, but do not mount
> them; use "vos release" whenever the software is updated
> b) in the event of a failure of server1 (which has the rw volume), drop
> the existing mount and mount one of the read-only volumes (we can live
> with the read-only copy whilst server1 is being repaired/replaced) in
> its place.
> 
> Can anyone see problems with that scenario?  We could use "vos
> convertROtoRW"; how would that affect the process?

Volumes "app" and "app.readonly" have not only different names but
different volume ids.  Once an application has opened a directory
or file on "app" it will continue to try to access the "app" volume
even after the "app" volume is no longer present.  Changing a mount
point to refer to "app.readonly" will only affect future attempts
to evaluate the path that resulted in "app" being accessed.
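
For concreteness, the remount step in the proposed plan might be sketched
as below.  This is only an illustration: the mount path, the DRY_RUN
switch, and the run helper are hypothetical, and only the "fs rmmount"
and "fs mkmount" commands themselves are real.  By default the script
prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: repoint a mount point from "app" to "app.readonly".
# MOUNT_DIR is a hypothetical path; substitute the real mount point.
MOUNT_DIR="${MOUNT_DIR:-/afs/.example.com/software/app}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

remount_readonly() {
    # Remove the mount point that refers to the read-write volume.
    run fs rmmount -dir "$MOUNT_DIR"
    # Recreate it so that it refers to the read-only volume by name.
    run fs mkmount -dir "$MOUNT_DIR" -vol app.readonly
}

remount_readonly
```

Even if these commands succeed, a process that already has a file or
directory open on "app" keeps using the "app" volume id; only new path
evaluations see "app.readonly".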

The behavior you are looking for requires that the client believe
that there is an alternate location for the "app" volume to fail over
to.  In other words, you require that there be read-write replicas.
This support does not currently exist.

Using convertROtoRW will not provide the equivalent of a read-write
replica because the client is currently accessing the "app" volume
on the one and only file server that the vlserver reported as the
volume's location.  When the VLDB is updated, the client will
not receive any indication that the change occurred.  During a volume
move operation that notification would have come from the file server
from which the volume was being moved.  Of course, that file server
is no longer responsive and is not involved in the move.  Volume
location information is valid for two hours but can be manually
invalidated on the client using the "fs checkvolumes" command.
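
A manual recovery along these lines might look like the sketch below.
The server name "server2", partition "a", the DRY_RUN switch, and the
run helper are assumptions made for illustration; the "vos
convertROtoRW" and "fs checkvolumes" commands are the ones named above.
Check the vos_convertROtoRW man page before relying on it, since the
command may ask for confirmation.  By default the script only prints
the commands.

```shell
#!/bin/sh
# Sketch only: promote a read-only copy and refresh client state.
# SURVIVOR and PARTITION are hypothetical; use the real RO site.
SURVIVOR="${SURVIVOR:-server2}"
PARTITION="${PARTITION:-a}"
RO_VOLUME="${RO_VOLUME:-app.readonly}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

promote_replica() {
    # Administrator step: convert the RO copy on the surviving
    # server into the RW volume, which updates the VLDB entry.
    run vos convertROtoRW -server "$SURVIVOR" \
        -partition "$PARTITION" -id "$RO_VOLUME"
}

refresh_client() {
    # Client step, to be repeated on every client machine: discard
    # cached volume location information instead of waiting for it
    # to expire.
    run fs checkvolumes
}

promote_replica
refresh_client
```

The client step matters because the failed file server cannot notify
clients of the change, so each client must be told to look the volume
up again.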

During the 2009 Google Summer of Code some progress was made towards
implementing a read-write replication model.  If you are aware of
resources that could be contributed to help complete this effort,
please contact the OpenAFS gatekeepers.

Jeffrey Altman