[OpenAFS] Re: Minor question on moving AFS db servers

Andrew Deason adeason@sinenomine.net
Wed, 29 Oct 2014 15:28:13 -0500

On Wed, 29 Oct 2014 19:02:10 +0100
Jan Iven <jan.iven@cern.ch> wrote:

> On 10/29/2014 06:47 PM, Garance A Drosehn wrote:
> > The oddity is that during the time that the AFS processes were not
> > running on either machine, AFS access on many of our AFS clients
> > was pretty slow.  Everything worked, but much slower than normal.
> > I'm pretty sure the delay was all in the lookup-step, and that if
> > some AFS client already had a file open in AFS then I/O performance
> > to that file was fine.

You should have seen a message in syslog saying that the client detected
a vlserver was down; once that happens, you shouldn't see any more slow
behavior (but bugs are of course possible :)

> AFS clients-as-in-userspace tools (vos exa, pts) will contact a random 
> DB server each time, so in your case have 1/4 chance of waiting (no 
> "learning" over several invocations).
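
To put a number on the "1/4 chance" above, here is a minimal Python
sketch. It assumes four listed dbservers with exactly one of them down,
and a uniformly random pick on every invocation with nothing remembered
between runs. The host names are made up for illustration, and this is
only a model of the behavior described, not how the real tools select a
server internally.

```python
import random

# Hypothetical dbserver names; "db4" stands in for the machine that is down.
DB_SERVERS = ["db1", "db2", "db3", "db4"]
DOWN = {"db4"}


def fraction_hitting_down_server(invocations, seed=0):
    """Simulate independent tool runs that each pick one dbserver at random.

    No state is carried between runs, so every run has the same chance of
    picking the dead server first and waiting for a timeout.
    """
    rng = random.Random(seed)
    slow = sum(1 for _ in range(invocations) if rng.choice(DB_SERVERS) in DOWN)
    return slow / invocations


if __name__ == "__main__":
    # Fraction of simulated runs that would have had to wait.
    print(fraction_hitting_down_server(100_000))
```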

Fileservers are another type of "userspace client", since they do
contact dbservers. Most often that happens when they receive a new
connection (usually because someone 'aklog'd) and need to calculate
group membership. That is not very often.

I just mention this because I don't think there's any way to avoid this
one. Other userspace "clients" will not notice because they are
short-lived processes, but we don't have a way to notify long-running
processes of CellServDB changes. So the fileserver just has to
notice that the ptserver it's trying to reach is down; but the
fileserver doesn't try to do this _that_ much, so for a short transition
it can effectively be avoided.

> AFAIK there is no gentle way to pre-announce "this one is going away". 
> You could push a new CellServDB before every update, and run "fs 
> setserverprefs -vlservers" to penalize the machine that is going away 
> (or restart the AFS clients), but in our case we didn't do this.

Some people will just 'fs newcell' to update the in-memory list when
doing a CellServDB change. Using setserverprefs is probably better, but
either way, that's how you avoid this problem.
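
As a rough sketch of the two approaches, the Python below only builds
and prints the command lines; it does not run anything. The cell name,
host names, and rank value are placeholders. It assumes the usual
'fs newcell -name <cell> -servers <hosts>' and
'fs setserverprefs -vlservers <host> <rank>' forms, where a higher rank
means less preferred. Check the man pages for your OpenAFS version
before relying on the exact syntax.

```python
import shlex


def newcell_cmd(cell, servers):
    """Command line to replace the client's in-memory dbserver list for a cell."""
    return ["fs", "newcell", "-name", cell, "-servers", *servers]


def penalize_vlserver_cmd(host, rank=60000):
    """Command line to give one vlserver a high (unattractive) rank.

    The default rank is an arbitrary large placeholder, not a recommended value.
    """
    return ["fs", "setserverprefs", "-vlservers", host, str(rank)]


def render(argv):
    """Quote an argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Placeholder names: db4.example.org is the dbserver being retired.
    remaining = ["db1.example.org", "db2.example.org", "db3.example.org"]
    print(render(penalize_vlserver_cmd("db4.example.org")))
    print(render(newcell_cmd("example.org", remaining)))
```

Printing rather than executing keeps this a dry run; the output would
still need to be run on each client for it to have any effect.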

Andrew Deason