[OpenAFS] Re: Ubik problem

Marcus Watts mdw@umich.edu
Fri, 16 Apr 2010 17:58:13 -0400

Atro Tossavainen <atro.tossavainen+openafs@helsinki.fi> writes:
> You "know" that?  That's a misassumption at best.
> sun4x_58 # cksum /usr/afs/etc/KeyFile
> 2143645127      100     /usr/afs/etc/KeyFile
> sunx86_510 # cksum /usr/afs/etc/KeyFile
> 2143645127      100     /usr/afs/etc/KeyFile

You say your software said
"ticket contained unknown key version number"
If it said that, a ticket that came from <some other source> contained
a key version number that doesn't match any key that software knows about.

Having said that, no, in fact, I said more than I should have.
I should have said, "we know the key data isn't consistent".

If there's any combination of "-localauth" that doesn't work,
*then* I could correctly say the key files aren't consistent.
*That* smoking gun hasn't been produced yet, and may not exist.

What I *can* say is that the key data between your key files
and your KA database is NOT consistent.
	(because you said:
		ka> exa useraccount
		examine: ticket contained unknown key version number getting information for useraccount.
	)

So, here's how to diagnose that.

Try to reproduce these results:

strawdogs-root# bos listkeys strawdogs -localauth 
key 3 has cksum 1207506455
key 4 has cksum 1508428935
key 0 has cksum 436464802
Keys last changed on Fri Apr 16 02:09:06 2010.
All done.
(do this for each of your hosts.  since you claim your key files
are identical, you *should* see the same results.)
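
If you'd rather not eyeball that across several hosts, here is a minimal
sketch of the comparison in Python.  It assumes the "key N has cksum M"
line format shown above; parse_listkeys and hosts_disagreeing are names
I made up for illustration, not part of OpenAFS.

```python
import re

# Matches lines such as "key 3 has cksum 1207506455".
KEY_LINE = re.compile(r"^key (\d+) has cksum (\d+)$")

# Output copied from the transcript above.
SAMPLE = """key 3 has cksum 1207506455
key 4 has cksum 1508428935
key 0 has cksum 436464802
Keys last changed on Fri Apr 16 02:09:06 2010.
All done.
"""

def parse_listkeys(text):
    """Return {kvno: cksum} from 'bos listkeys' output; other lines are ignored."""
    keys = {}
    for line in text.splitlines():
        m = KEY_LINE.match(line.strip())
        if m:
            keys[int(m.group(1))] = int(m.group(2))
    return keys

def hosts_disagreeing(outputs):
    """Given {host: listkeys output}, return the sorted hosts whose
    key set differs from that of the first host in the mapping."""
    parsed = {host: parse_listkeys(text) for host, text in outputs.items()}
    if not parsed:
        return []
    reference = next(iter(parsed.values()))
    return sorted(host for host, keys in parsed.items() if keys != reference)
```

Feed it the saved output of "bos listkeys <host> -localauth" for each
host; an empty list back means every host reported the same kvno/cksum
pairs.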

strawdogs-root# kas inter admin -server strawdogs
Administrator's (admin) Password: 
ka> e afs

User data for afs
  key (0) cksum is 436464802, last cpw: Fri Apr 16 02:06:44 2010
  password will never expire.
  An unlimited number of unsuccessful authentications is permitted.
  entry never expires.  Max ticket lifetime 100.00 hours.
  last mod on Fri Apr 16 02:06:44 2010 by <none>
  permit password reuse

(do this for each of your hosts.)

You'll only see one key for afs in each copy of ka.
The key version and checksum should be the same on all servers.  The
cksum and kvno that appear there should match one entry in the keyfile.
You might have other keys in your keyfile.  I have multiple keys
because one matches what I currently have in kerberos 5, and another
is probably old junk.  They won't matter to ka (well, except for ubik.)
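
The "should match one entry in the keyfile" check can be sketched like
this.  It assumes the "key (N) cksum is M" wording shown in the kas
output above, and takes the keyfile as a plain {kvno: cksum} dict that
you fill in from bos listkeys.  Both function names are hypothetical.

```python
import re

# Matches the "key (0) cksum is 436464802" part of 'kas examine' output.
KA_KEY = re.compile(r"key \((\d+)\) cksum is (\d+)")

def parse_ka_key(text):
    """Return (kvno, cksum) from 'kas examine' output, or None if absent."""
    m = KA_KEY.search(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

def ka_key_in_keyfile(ka_key, keyfile):
    """True if the ka entry's kvno is in the keyfile with the same cksum."""
    kvno, cksum = ka_key
    return keyfile.get(kvno) == cksum

# Values copied from the transcripts above.
EXAMINE = "  key (0) cksum is 436464802, last cpw: Fri Apr 16 02:06:44 2010"
KEYFILE = {3: 1207506455, 4: 1508428935, 0: 436464802}
```

Run it once per server's copy of ka.  A False from any server points at
the copy whose afs key is not in the keyfile.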

If one kadb contains a key in the keyfile, and another kadb's key is not
in the keyfile, then whether you get a working ticket on any particular
request is a 50% venture.  If one host has ubik problems, then that's
going to shift the odds.
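
To put rough numbers on that: a back-of-the-envelope sketch, assuming
each request lands on one server and that you can guess what share of
requests each server answers.  The function name and the shares are
illustrative assumptions, not measurements.

```python
def working_ticket_odds(servers):
    """servers: list of (share_of_requests, key_is_in_keyfile) pairs.
    Returns the fraction of requests answered by a server whose
    kadb key is present in the keyfile."""
    total = sum(share for share, _ in servers)
    if total <= 0:
        raise ValueError("no requests are being served")
    good = sum(share for share, ok in servers if ok)
    return good / total
```

Two servers with equal shares, one good and one bad, give 0.5.  If the
bad server answers less often because of its ubik trouble, say shares of
3 and 1, the odds of a working ticket rise to 0.75.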

You ought to also be able to dump out kadb using "kadb_check".
That's not going to work on your little endian machines.  Since you
have at least one big-endian machine, you could copy kadb to that host,
run kadb_check there, and compare the output between your different
servers.  Once you've found a sane kaserver.DB*, you can move the other
copies out of the way and let ubik replicate the sane one, or if you
haven't fixed ubik yet, you can copy it over manually (when kaserver
isn't running, of course.)
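
For the "compare the output" step, a plain text diff is probably enough.
A minimal sketch using Python's difflib, treating each kadb_check dump
as opaque text (I'm not assuming anything about its format); diff_dumps
and the server labels are made-up names.

```python
import difflib

def diff_dumps(dump_a, dump_b, name_a="server-a", name_b="server-b"):
    """Return unified-diff lines between two saved kadb_check dumps.
    An empty list means the two dumps are identical."""
    return list(difflib.unified_diff(
        dump_a.splitlines(),
        dump_b.splitlines(),
        fromfile=name_a,
        tofile=name_b,
        lineterm="",
    ))
```

Save the kadb_check output for each server's copy to a file, read the
files in, and print the returned lines.  Any "-" or "+" lines show where
the copies disagree.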

				-Marcus Watts