[OpenAFS] ubik problem in a mixed IBM AFS/OpenAFS environment

Dexter 'Kim' Kimball dhk@ccre.com
Mon, 10 May 2004 14:51:41 -0600


There's a reason for requiring a quorum ...

Say you've got 3 db servers and the sync site  (DBM1) fails.

One of the other two DBMs (DB Machine) becomes the sync site.

Any changes are propagated to 2 of the 3 DBMs.

DBM1 db version: 1
DBM2 db version: 2
DBM3 db version: 2

If DBM2 goes down while DBM1 is still down, DBM3 is left alone and can't be
the sync site (highest IP assumed): one of three is not a quorum.  If DBM1
comes back, it becomes the sync site and gets the most recent db version
from DBM2 or DBM3, whichever it can reach first.  Until it can reach one of
the others, DBM1 is not a sync site and won't accept changes.
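
The scenario above can be walked through in a small simulation.  This is
only a toy sketch: it assumes quorum means a strict majority of the
configured servers and ignores everything else Ubik really does (voting,
timeouts, address-based tie-breaking).  The Cell class and its method names
are made up for illustration.

```python
class Cell:
    """Toy model of three db servers; not Ubik's actual protocol."""

    def __init__(self, names):
        self.version = {n: 1 for n in names}  # db version held by each server
        self.up = set(names)

    def has_quorum(self):
        # Assumption: quorum = strict majority of all configured servers.
        return len(self.up) > len(self.version) / 2

    def write(self):
        """Apply one change if a quorum is up; return True if accepted."""
        if not self.has_quorum():
            return False
        new = max(self.version[n] for n in self.up) + 1
        for n in self.up:
            self.version[n] = new
        return True

    def recover(self, name):
        """Bring a server back; with a quorum it is forced to the newest version."""
        self.up.add(name)
        if self.has_quorum():
            newest = max(self.version[n] for n in self.up)
            for n in self.up:
                self.version[n] = newest


cell = Cell(["DBM1", "DBM2", "DBM3"])
cell.up.discard("DBM1")          # sync site fails
accepted_two_up = cell.write()   # DBM2 + DBM3 still form a quorum
cell.up.discard("DBM2")          # now only DBM3 is up
accepted_one_up = cell.write()   # refused: one of three is not a quorum
cell.recover("DBM1")             # DBM1 + DBM3 form a quorum again
print(accepted_two_up, accepted_one_up, cell.version)
```

The write with two servers up is accepted, the write with one server up is
refused, and DBM1 picks up version 2 as soon as it rejoins a quorum.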



OR (if one site is up it can be the sync site, quorum be d***ed):

DBM1 goes down.

DBM2 becomes sync site and accepts a change.

DBM1 db version: 1
DBM2 db version: 2
DBM3 db version: 2

Oops. DBM2 goes down.  Under the new rules, DBM3 _can_ be the sync site; it
accepts a bunch of changes no other DBM knows about, and its db version goes
to humpty eleven.

DBM1 db version: 1
DBM2 db version: 2
DBM3 db version: humpty eleven

Then DBM3 goes down. DBM1 is still down.

DBM2 comes back up as the only DBM and accepts a bunch of changes that DBM1
and DBM3 don't know about.  db version goes to humpty twelve.

DBM1 db version: 1
DBM2 db version: humpty twelve
DBM3 db version: humpty eleven


We no longer have an authoritative db.

This is evil.
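
To see why neither copy can win, here is a toy sketch of the no-quorum
case.  It assumes each server's database can be modeled as the set of
changes it has seen; the real servers keep no such per-change record, and
the change names c1..c4 are invented for illustration.

```python
# Toy model: each server's database is just the set of changes it has seen.
db = {"DBM1": set(), "DBM2": set(), "DBM3": set()}


def write(up, change):
    """No quorum check: whichever servers are up accept the change."""
    for name in up:
        db[name].add(change)


write({"DBM2", "DBM3"}, "c1")   # DBM1 is down
write({"DBM3"}, "c2")           # DBM2 down too; DBM3 carries on alone
write({"DBM3"}, "c3")
write({"DBM2"}, "c4")           # DBM3 down; DBM2 returns alone

only_dbm2 = db["DBM2"] - db["DBM3"]   # changes DBM3 never saw
only_dbm3 = db["DBM3"] - db["DBM2"]   # changes DBM2 never saw
print(sorted(only_dbm2), sorted(only_dbm3))
```

Both differences are non-empty, so picking either DBM2 or DBM3 as "the"
database throws away somebody's updates.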

AFS clients requesting VLDB info, for example, could get different locations
from different VL servers -- and some of that info would be wrong, causing
the client to puke, hurl, and barf (well, fail to access the volume).

Or I change my password.  Only DBM2 knows about the new password.
Unfortunately klog chooses a server at random from the CellServDB, so on
average I get tokens once every three attempts.  Did I say I mystify
readily?
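
The one-in-three figure is simple arithmetic, sketched here under the
assumption that each klog attempt picks one of the three servers uniformly
at random and independently of earlier attempts.

```python
from fractions import Fraction

servers = ["DBM1", "DBM2", "DBM3"]
has_new_password = {"DBM2"}

# Assumption: each attempt picks one server uniformly at random.
p_success = Fraction(len(has_new_password), len(servers))
expected_attempts = 1 / p_success   # mean of a geometric distribution
print(p_success, expected_attempts)
```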


Bear in mind that there's no way to merge the changes made on the rogue sync
sites DBM2 and DBM3: no transaction logs are kept.  (Feasible, yes;
implemented, no; worth implementing, not in my experience.)


As long as a quorum is required we don't lose updates.  If two of three go
down, we don't accept updates.  If two of three are up, we do accept
updates -- and when the third comes back it will be forced up to date from
the single prevailing db version.
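
The reason a quorum gives a single prevailing version is that any two
majorities share at least one server, so the latest accepted change is
always visible to the next quorum.  A brute-force check for three servers,
assuming quorum means "at least two of three" (the helper names are
invented for this sketch):

```python
from itertools import combinations

servers = ["DBM1", "DBM2", "DBM3"]


def groups_of_at_least(k):
    """All subsets of servers with at least k members."""
    return [set(c) for r in range(k, len(servers) + 1)
            for c in combinations(servers, r)]


def always_overlap(groups):
    """True if every pair of groups shares at least one server."""
    return all(a & b for a, b in combinations(groups, 2))


majority_ok = always_overlap(groups_of_at_least(2))  # quorum required
single_ok = always_overlap(groups_of_at_least(1))    # any lone server may write
print(majority_ok, single_ok)
```

With a quorum required every pair of writing groups overlaps; once a lone
server may write, DBM2 alone and DBM3 alone share nobody, which is exactly
the humpty eleven / humpty twelve mess.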

Does this make sense?  (I'm going to stop here.  I see the beginning of a
circle looming large.)

Kim



=================================
Kim (Dexter) Kimball


On 5/10/2004 2:11:27 PM, Nicolescu, Edward L (edward@bnl.gov) wrote:
> Jimmy,
>
> thanks. I was wrongly expecting my two-server scenario to unfold as if
> I had only the two of them mentioned in CellServDB, not three... My
> oversight...
>
> On the other hand, it would be nice to have an algorithm allowing the
> selection of a sync-site even when 2 out of the 3 db servers go down.
>
> Thanks again.
>
> Edward
>
> > -----Original Message-----
> > From: jimmy@e.kth.se [mailto:jimmy@e.kth.se]
> > Sent: Monday, May 10, 2004 5:30 AM
> > To: Nicolescu, Edward L
> > Cc: openafs-info@openafs.org
> > Subject: Re: [OpenAFS] ubik problem in a mixed IBM AFS/OpenAFS
> > environment
> >
> >
> > "Nicolescu, Edward L" <edward@bnl.gov> writes:
> >
> > > Hi everybody,
> > >
> > > here is the problematic configuration:
> > >
> > > 2 afs db servers,
> >
> > You have 3 DB-servers at the current time.
> >
> > rafs01.rcf.bnl.gov
> > rafs02.rcf.bnl.gov
> > rafs03.rcf.bnl.gov
> >
> > Do you want to disable 1 or 2 of them ?
> >
> > > as expected, given the lower ip address. However, shutting
> > down the db
> > > server processes on the