[OpenAFS-devel] prdb format extension for extended authentication names

Jeffrey Hutzelman jhutz@cmu.edu
Fri, 28 Jun 2013 15:42:19 -0400


On Fri, 2013-06-28 at 14:08 -0400, Benjamin Kaduk wrote:
> On Wed, 26 Jun 2013, Jeffrey Hutzelman wrote:

> I think we may be talking slightly past each other; I had two bits for 
> each flag (in different words) -- one is written at step (3), and the 
> second is written when that feature is actually *used* (e.g., when the 
> first extended name or hash table is added, or the first supergroup entry, 
> etc.).  I'm not sure whether you're thinking of the first or the second 
> one, or lumping them together.  (The second one is only really needed if 
> we want to have the ability to live-rollback a version upgrade if the new 
> feature is not actually used, which was mentioned earlier in the thread.)

Right; I was thinking only in terms of the bit that "upgrades" the
database and allows the new feature to be used.  Having a second bit
does present some additional advantages.


> > I was expecting (1) to be done manually, per feature, and (2) to be
> > handled automatically by the server doing the upgrade.  Note that what
> > actually happens in (4) is what I described above -- the server _does_
> > join the quorum, but declines to process any RPCs that require the
> > database.
> 
> I was also expecting (2) to be handled automatically, but it occured to me 
> that perhaps an administrator might want to run the check themself and 
> overrule the system, so I added the other option.  It might be a bad idea 
> if it would leave a slave running which doesn't know about the new format; 
> I'm not sure I traced through Initdb() and {Write,Read}Preamble properly 
> to understand the "keeps running for quorum purposes" behavior.  It 
> *looks* like Initdb() is called before any write or read and would prevent 
> that slave from giving useful responses, but what the failover behavior of 
> clients would be, etc....

Initdb() is called from any RPC that would access the database, every
time.  So if the database is suddenly not a supported version, then RPCs
will fail.  The error is a ptserver-specific error, so the failure is
hard -- the ubik client library will not try another server.

So letting the admin override is "safe", as is upgrading with a quorum
of less than all the servers, or adding a "new" server running old code
that doesn't understand the current database.  But "safe" here means
only that no server will corrupt the database or give wrong results; it
doesn't mean you won't see outages.

In principle, I have no objection to allowing an admin to override, so
long as the normal behavior is for something to check and refuse to
upgrade to a version not supported by all servers.



> >> Some people in this thread were advocating that, going forward, running
> >> dbserver software always know about supergroups (as opposed to being able
> >> to compile away support for them).  This is a behavior change, and I am
> >> not thinking about what its details should be or how to implement it,
> >> right now.
> >
> > That was Simon, and he's right.  We've effectively already revved the
> > database version without actually bumping the version number.  You can
> > decline to support supergroup semantics, but the database fields have
> > been used, and they record relationships.  So, it's not safe for a
> > database with supergroups data to be modified by a server that doesn't
> > have that data.  The solution is that, going forward, we should remove
> > the option to build a dbserver that does not understand this format.
> 
> And then provide a flag to indicate whether it is allowed to be used?
> We would need to be clever to avoid breaking people who currently have it 
> enabled while not force-enabling it on all sites.  (Part of why I didn't 
> want to think about it yet.)

Well, there are a number of options....

We could add a command-line argument to control whether to enable the
feature.  This has two drawbacks.  First, as you note, the default is
not going to be right for everyone who upgrade.  Second, this makes it
possible to set the behavior differently on different servers, which
means the same GetCPS won't produce the same results on all servers.

We could add some kind of flag in the database indicating whether to
enable the feature.  This doesn't affect the database format; it's just
configuration indicating how the admin wants the system to behave.  I
sort of like this idea in general, because it means that policy can be
changed simultaneously on all servers(*) and without restarting.
However, with respect to supergroups, it still have the problem you
describe.

We could bump the version number more than once.  Version 1 would be
today's format without supergroups; version 2 adds supergroups; and
version 3 adds extended names.  All servers going forward would
understand at least version 2.  A new server encountering a version 0
database would have to scan to discover whether the feature is currently
in use, and then upgrade the version to 1 or 2, as appropriate.

Or, for this particular transition, we could simply decide that
supergroups are now a standard feature of OpenAFS.  All code going
forward would understand version 0 to include supergroups, and act
accordingly.  Optionally, we could allow admins to control whether the
server will allow groups to be added to groups (default no); this would
allow upgrading to a new version without the fear that someone will add
a group-in-group relationship which prevents downgrading.  The default
will still be wrong for people already using supergroups, but the
consequence would be a limitation on the ability to make changes;
existing data would be processed correctly.


Personally, I'm in favor of the following:

- The database format changes linearly for now.  To use a new feature,
you must
  have a database version which is new enough to support that feature.
- Database version bumps are explicit, and are managed via a set of RPCs
that
  allow getting the current and maximum supported version and upgrading
to a
  version not newer than the maximum.
- Checking version support before upgrading is handled by the pts
command that
  does the upgrade, and a -force flag is provided.
- Database format downgrades must be done offline, if they are possible
at all.

- Reserve some space in the header or elsewhere for storing system-wide
policy,
  including whether certain features are enabled at runtime.  This
doesn't
  affect the database format, only the server behavior.  These knobs are
managed
  via a pair of RPCs that allow getting and setting the value for a
knob, or
  perhaps for a list of knobs (perhaps we even permit "get all knobs").
The
  set of available knobs is a function of the database format.

  I imagine that on the wire, a knob has a 32-bit name and a 32-bit
value.  Of
  course, the range of permitted values will vary from one to another.
If we
  wanted to get fancy, we could provide a way to get descriptions and
permitted
  values.

- Define version 0 to include supergroups.  New servers always
understand the
  database fields used for this purpose and maintain them correctly.
Whether
  supergroups are enabled is controlled by a 3-position knob: 0 means
"no",
  1 means "yes", and -1 means "yes, but they cannot be created".  The
default
  setting is -1.

So new features will generally require a database version bump and a
knob to control whether the feature is enabled.  Knobs could also be
used for other things, like controlling the default group quota for new
users, or whether foreign users are allowed to self-register, or
whatever.

Since this is about managing the on-disk database format and controlling
behavior that is dictated by policy, none of the RPCs involved need to
be standardized.  They should be documented, of course.




(*) Well, all servers participating in the current quorum.  But the
effect on other servers is equivalent to that of any other kind of
database changes.  Out of date servers will give wrong answers until
they bring their database up to date, and we've already decided that's
what we want (otherwise we wouldn't be using ubik_BeginTransReadAny()
everywhere).