[Fwd: Re: [OpenAFS-devel] prdb format extension for extended authentication names]

Jeffrey Hutzelman jhutz@cmu.edu
Wed, 26 Jun 2013 16:48:03 -0400


-------- Forwarded Message --------
From: Jeffrey Hutzelman <jhutz@cmu.edu>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: jhutz@cmu.edu
Subject: Re: [OpenAFS-devel] prdb format extension for extended
authentication names
Date: Wed, 26 Jun 2013 13:57:32 -0400

On Wed, 2013-06-26 at 12:58 -0400, Benjamin Kaduk wrote:
> On Wed, 26 Jun 2013, Simon Wilkinson wrote:
> 
> >
> > On 25 Jun 2013, at 00:57, Benjamin Kaduk wrote:
> >> So, we would introduce a flags framework with a DBVERSION bump, and then allocate flags for new features?
> >
> > This really does feel like it is unnecessary complexity for now. Until 
> > lots of people start proposing database format changes, a single 
> > monotonic version number should serve OpenAFS fine.
> 
> That's fair.  No need to make too much extra work for ourselves.
> 
> Do I then conclude that my proposal is fine, except that it should bump 
> DBVERSION to 1?  We would need to resolve the details of how an update 
> would occur, if we don't want to require simultaneous update of the 
> software on all dbservers, of course.

Yeah, this is a real problem.  The PRDB is a distributed database
precisely so that I can run three servers and not have to take
user-visible outages.  Being forced to take an outage in order to apply
an update would not be OK.

However, I think we can go back to what we were discussing earlier.  To
bump to a new database version, you must first upgrade all servers to
something new enough to support that version.  Then you take an action
(presumably, make an RPC) that bumps the version.  The following
restrictions apply...

1) A server (coordinator) will refuse to upgrade the database version
to a version it doesn't support.  So, you can't upgrade to a db version
unless you have at least one server that supports that version.

2) A server will refuse to function if the database version is not one
that it supports.  This is already true -- Initdb() fails with PRDBBAD
if the database version is not recognized.  Note that the server will
continue to _run_, which means you can have enough servers to maintain a
quorum, even if some of them cannot provide service.
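
A minimal sketch of these two restrictions, in Python for brevity.  The
class, method, and function names are hypothetical and are not OpenAFS
code; the "PRDBBAD" string only stands in for the error Initdb() returns,
and the quorum rule is simplified to a strict majority of running servers.

```python
class DbServer:
    """Toy model of one dbserver; not actual ptserver code."""

    def __init__(self, name, supported_versions):
        self.name = name
        self.supported = set(supported_versions)

    def can_serve(self, db_version):
        # Restriction 2: no service for an unrecognized database version.
        return db_version in self.supported

    def handle_rpc(self, db_version):
        # The process keeps running (and voting) either way; only
        # database-backed RPCs fail.
        return "ok" if self.can_serve(db_version) else "PRDBBAD"

    def bump_version(self, db, new_version):
        # Restriction 1: a coordinator refuses a version it does not support.
        if new_version not in self.supported:
            raise ValueError(
                f"{self.name} does not support version {new_version}")
        db["version"] = new_version


def quorum_ok(servers_running, total_servers):
    # Simplifying assumption: quorum is a strict majority of servers that
    # are running, whether or not each one can currently provide service.
    return servers_running > total_servers // 2
```

With three servers where only one has been upgraded, the upgraded one
could bump the version, the other two would then answer PRDBBAD, and all
three would still count toward the quorum.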



> >> I guess we could steal a couple of words in the header to indicate (respectively) "feature supported/enabled by the running quorum" and "feature in use in this database".
> >
> > The problem is that you don't know the feature set supported by the 
> > running quorum. Only the master can write to the database - so even if 
> > that updates the database with its feature set every time it is 
> > restarted, the slaves get no say. One of the challenges with Ubik is 
> > that there is currently no mechanism to do configuration negotiation 
> > during a quorum election. So there's no way to notice during that 
> > election that a slave's configuration is non-standard.
> 
> Perhaps I am confused, but I was expecting that:
> (1) something (perhaps administrator action) causes the master dbserver to 
> decide that it should try to enable a feature flag.
> (2) something (maybe the administrator, maybe the master dbserver itself) 
> calls an RPC against each server in the quorum, individually, to query 
> support for that feature.
> (3) If all servers in the quorum report success, the master dbserver makes 
> a write indicating that the feature is enabled
> (4) subsequent slave dbservers attempting to join that do not support the 
> indicated feature notice that the flag is set and decline to join the 
> quorum.

You're not confused.  That's exactly what I expected -- the feature
flags describe features that are used in the database, not features that
some server supports.  In other words, they work just like the feature
flags in a filesystem, except that you have to be more careful about
turning one on, since there are multiple servers using the same database
at once.

I was expecting (1) to be done manually, per feature, and (2) to be
handled automatically by the server doing the upgrade.  Note that what
actually happens in (4) is what I described above -- the server _does_
join the quorum, but declines to process any RPCs that require the
database.
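
The four steps, with the amendment to (4), could be sketched roughly as
follows.  This is Python for brevity and is only an illustration: the
flag bit, the Server class, and enable_feature() are hypothetical names,
and the dict stands in for the database header.

```python
FEATURE_EXTENDED_NAMES = 0x1  # hypothetical flag bit


class Server:
    def __init__(self, supported_mask):
        self.supported_mask = supported_mask

    def supports(self, flag):
        # Step (2): the per-server "do you support this?" query.
        return bool(self.supported_mask & flag)

    def can_use_database(self, db_flags):
        # Step (4), as amended: the server still joins the quorum, but
        # serves database RPCs only if it understands every flag in use.
        return (db_flags & ~self.supported_mask) == 0


def enable_feature(db, servers, flag):
    # Step (1) is the administrator asking the master to do this.
    if not all(s.supports(flag) for s in servers):
        return False
    # Step (3): record that the feature is now in use in the database.
    db["flags"] |= flag
    return True
```

The flag is written only after every server has answered yes, so a
server that later declines to serve is one that was not part of that
check (for example, one that was down or was downgraded afterwards).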

> >> I am deliberately not making a concrete proposal for how to handle the 
> >> supergroups transition right now;
> >
> > There isn't a "supergroups transition". The current supergroups 
> > implementation is an unavoidable fact of our current database version - 
> > we have to assume that all of the supergroups fields are occupied in all 
> > databases with that version, and our recovery tools have to deal with 
> > the imperfections of the way supergroups are currently stored.
> 
> Some people in this thread were advocating that, going forward, running 
> dbserver software always know about supergroups (as opposed to being able 
> to compile away support for them).  This is a behavior change, and I am 
> not thinking about what its details should be or how to implement it, 
> right now.

That was Simon, and he's right.  We've effectively already revved the
database version without actually bumping the version number.  You can
decline to support supergroup semantics, but the database fields have
been used, and they record relationships.  So, it's not safe for a
database with supergroups data to be modified by a server that doesn't
understand that data.  The solution is that, going forward, we should remove
the option to build a dbserver that does not understand this format.



> > I think all of this only really comes into play when we're discussing 
> > how to store GSS names within the database. And it seems to me that the 
> > simplest way of doing that is with a controlled version number bump.
> 
> Okay.  There's nothing wrong with setting DBVERSION=1 for extended names, 
> and then later deciding that we want to move to flags as DBVERSION=2, as 
> far as I can tell.  Maybe jhutz disagrees, but we'll see. :)

No, that's fine.  The requirements for dealing with version transition
are the same either way.  Feature flags just give you a small-diameter,
many-dimensioned version space instead of a large-diameter
one-dimensional one.
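
One way to read the "diameter" remark, sketched in Python.  The two
distance functions are hypothetical illustrations, not anything in
OpenAFS: a version number measures distance along a line, while a flag
word measures it as the number of flags that differ.

```python
def version_distance(a, b):
    # One dimension: how many version bumps separate two databases.
    return abs(a - b)


def flag_distance(a, b):
    # Many dimensions: how many feature flags differ between two databases.
    return bin(a ^ b).count("1")


def diameters(n_features):
    # Assumption for illustration: n independent features give 2**n
    # possible database states under either scheme.
    states = 2 ** n_features
    return {"version": states - 1, "flags": n_features}
```

Under that assumption, three features give eight states: as version
numbers the farthest two are seven bumps apart, as flags only three.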

-- Jeff