[OpenAFS] Re: DB servers "quorum" and OpenAFS tools

Jeffrey Hutzelman jhutz@cmu.edu
Fri, 24 Jan 2014 11:41:35 -0500


On Fri, 2014-01-24 at 08:01 +0000, Simon Wilkinson wrote:
> On 24 Jan 2014, at 07:48, Harald Barth <haba@kth.se> wrote:
> 
> > You are completely right if one must talk to that server. But I think
> > that AFS/RX sometimes hangs far too long waiting for one server
> > instead of trying the next one. For example for questions that could
> > be answered by any VLDB. I'm thinking of operations like group
> > membership and volume location.
> 
> I have long thought that we should be using multi for vldb lookups,
> specifically to avoid the problems with down database servers. The
> problem is that doing so may cause issues for sites that have multiple
> dbservers for scalability, rather than redundancy. Instead of each
> dbserver seeing a third (or a quarter, or ...) of requests it will see
> them all. Even if the client aborts the remaining calls when it
> receives the first response, the likelihood is that the other servers
> will already have received, and responded to, the request.
> 
> There are ways we could be more intelligent (for example measuring the
> normal RTT of an RPC to the current server, and only doing a multi if
> that is exceeded). But we would have to be very careful that this
> wouldn't amplify a congestive collapse.

The thing is, the OP specifically wasn't complaining about the behavior
of the CM, which remembers when a vlserver is down and then doesn't talk
to it again until it comes up, except for the occasional probe.

The problem is the one-off clients that make _one RPC_ and then exit.
They have no opportunity to remember what didn't work last time.  It
might help some for these sorts of clients to use multi, if they're
doing read-only requests, and probably wouldn't create much load.
However, for a call that results in a ubik write transaction, I'm not
entirely sure it's desirable to do a multi call.  That will require some
additional thought.
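
To make the read-only case concrete, here is a rough sketch of the
"ask everyone, take the first answer" pattern. It is plain Python with
threads, not Rx multi, and every name in it (multi_call, down_server,
up_server, db1, db2) is made up for illustration. The servers are
simulated by local functions; the 0.5 second sleep stands in for an RPC
that times out. It is only meant for idempotent read-only requests, and
note that every server still sees every request, which is exactly the
load concern raised above.

```python
import concurrent.futures
import time


def multi_call(servers, request, timeout=5.0):
    """Send `request` to all servers at once; return (name, reply)
    for the first server that answers successfully."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(servers))
    try:
        futures = {pool.submit(fn, request): name
                   for name, fn in servers.items()}
        for fut in concurrent.futures.as_completed(futures, timeout=timeout):
            try:
                return futures[fut], fut.result()
            except Exception:
                continue  # this server failed; wait for another one
        raise RuntimeError("no server answered")
    finally:
        # Do not wait for the slow servers; their replies are discarded.
        pool.shutdown(wait=False, cancel_futures=True)


def down_server(request):
    # Hypothetical stand-in for a dbserver that never answers.
    time.sleep(0.5)
    raise ConnectionError("server unreachable")


def up_server(request):
    # Hypothetical stand-in for a healthy dbserver.
    return f"location of {request}"


servers = {"db1": down_server, "db2": up_server}
name, reply = multi_call(servers, "root.cell")
print(name, reply)
```

The caller gets its answer from db2 right away instead of sitting
through db1's timeout first.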


In the meantime, another thing that might be helpful is for clients
about to make such an RPC to query the CM's record of which servers are
up, and use that to decide which server to contact.  A quick VIOCCKSERV
with the fast flag set could make a big difference.
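
As a sketch of the selection step only: assume the client has already
obtained, by whatever means, the set of servers the CM currently
considers down (the pioctl call itself is not shown here). The function
and host names below are hypothetical. Servers believed to be down are
moved to the end of the list rather than dropped, on the assumption
that the CM's record may be stale.

```python
def order_candidates(db_servers, known_down):
    """Return db_servers reordered so that servers not known to be
    down come first; relative order is otherwise preserved."""
    down = set(known_down)
    up = [s for s in db_servers if s not in down]
    suspect = [s for s in db_servers if s in down]
    return up + suspect


# Hypothetical example: the CM reports db1 as down.
candidates = order_candidates(["db1", "db2", "db3"], {"db1"})
print(candidates)
```

A one-shot client would then try the servers in that order, so it only
pays for a timeout on db1 if both db2 and db3 also fail.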

-- Jeff