[OpenAFS] Re: perl afs module question

David Botsch dwb7@ccmr.cornell.edu
Wed, 16 Apr 2003 11:01:23 -0400


Here's the problem with this explanation: I am not doing anything en 
masse. We are talking about one transaction... just a user attempting to 
change his password via a web page (which is what this code is designed 
to do). So, the server should not be getting confused in the manner 
described in the thread to which you referred.

I thought this error was being generated by the fact that 
&KTC::TOKEN->kvno = 0 while a "kas examine" on said user shows a nonzero 
key version number. Maybe I'm not understanding exactly what is 
happening here.

Unfortunately, I don't see anything in the afs server logs to help 
track this down.

thnx!

On 2003.04.16 06:26 Norbert Gruener wrote:
> Hi David,
> 
> On Tue, Apr 15 2003, David Botsch wrote:
> >
> > I have been trying to use your perl afs module to allow a user to
> > change his or her password.
> >
> > However, it is failing with the following error when I look at
> > $AFS::CODE
> > 	"ticket contained unknown key version number"
> [snipped]
> > Thanks for any insights you can provide!
> 
> in principle I do have the same problem.  But in 1996 there was a
> thread about that problem which explained what is going on.
> 
> -------------------  thread 1996 --------------------------------
> From: Marcus Watts <mdw@umich.edu>
> To: schemers@stanford.edu
> Cc: info-afs@transarc.com
> Subject: Re: unknown key version number...
> Date: Wed, 31 Jul 96 21:56:06 -0400
> 
> schemers@stanford.edu writes:
> 
> >
> > Hi. We have some account creation scripts that run out of cron every
> > night (in a pagsh) that grab an admin token and start creating users,
> > volumes, etc. After a while the script fails with the following error:
> >
> > Creating user xxxxxxx  : [rxk] ticket contained unknown key version number
> >
> > Anyone know why we get this error?
> 
> and
> 
> > I also forgot to mention that each create is done using
> > "kas create ... -pass ..." (it's on a secure server), so every
> > create gets a new token. I'll probably rewrite things to use
> > a custom "kas" command that grabs the admin DES key from a srvtab
> > and creates multiple accounts in one fell swoop, checking for errors
> > and getting a new token if need be.
> >
> > I'm still puzzled as to why the "kas create" command would fail
> > since each creation is done with a fresh token.
> 
> That message sounded so familiar, and now I know why!  (I think...)
> 
> It definitely has to do with running lots of those commands right in a
> row.  There are, indeed, some vaguely evil things about all this.
> The first is that each kas command is creating a separate connection.
> This is, actually, the root of the problem.  Those connections
> don't go away when the kas command exits, but hang around for
> "a while" in kaserver, consuming bits of server memory & such in the
> meantime.
> That isn't so bad in itself (other than slowing the server down), but
> I recall some other problems somewhere, that caused the server to
> somehow eventually become "confused" about which connection a packet
> belongs to.
> The end result is that you're getting that message because an old
> useless connection is snagging the packet and becoming unhappy.
> 
> There is also another problem, somewhere in ubik, that means when you
> do tons of back to back operations, eventually, one of them is going
> to hit a bad timing case, and not work.  I vaguely recall a fix in
> 3.4 that "improves" this, but doesn't make it perfect.
> 
> So, the following two things will definitely help:
> 
> 	(1) batch the kas operations up, and run a bunch of
> 	them with one "kas" command.  Don't run "too many",
> 	because while they're running, you are hurting your
> 	kaserver's performance.  10-30 may be a good number,
> 	depending on the size of your cell.  Sleep a while
> 	between each batch.  This should eliminate the unknown
> 	key version problem, but you will still see other
> 	occasional problems.
> 
> 	(2) look for failures, & retry them, perhaps after a
> 	suitable short delay.
> 
> The "custom" program is definitely a useful approach.  At UM,
> uniqname is our answer to the whole problem of dealing with
> the whole mess.
> 
> The "pts" command doesn't come with an "interactive" mode, unlike
> kas, so it's not so easy to batch "pts" commands up.  We ended
> up adding an "interactive" mode to pts, & a "sleep" command, so
> that we could run scripts that add lots of users to groups.  Also,
> our ptserver uses a "ubik" that is just a little bit different...
> 
> 				-Marcus Watts
> 				UM ITD PD&D Umich Systems Group
> 
> From: auvenj@vnet.ibm.com
> To: info-afs@transarc.com
> Subject: unknown key version number...
> Date: Wed, 31 Jul 96 13:20:52 PDT
> 
> 
>  >Hi. We have some account creation scripts that run out of cron every
>  >night (in a pagsh) that grab an admin token and start creating users,
>  >volumes, etc. After a while the script fails with the following error:
>  >
>  >Creating user xxxxxxx  : [rxk] ticket contained unknown key version number
>  >
>  >Anyone know why we get this error?
>  >
>  >thanks, roland
> 
>  I can't help with the "why" but I can say that we have received this
>  error also when creating accounts with a shell script.  What we had to
>  do was parse the output for this error and, if it occurred, try the
>  operation again up to 10 times after a sleep of 10 seconds.  This
>  seemed to give us reliable functionality.
> 
>                            ...Jason Auvenshine
>                            (auvenj@vnet.ibm.com)
> 
>  ISSC Tucson/San Jose AFS Team
> -------------------  thread 1996 --------------------------------
> 
> So my solution to this problem is the following:
> 
>    - create a KAS instance: $kas = AFS::KAS->AuthServerConn(...);
> 
>    - do your KAS action: $ok = $kas->ChangePassword(...)
> 
>    - check the return code to see whether the action was unsuccessful
> 
>    - if so, destroy your KAS instance and sleep for 5 seconds:
>      undef $kas; sleep 5;
> 
> I loop over these four steps at most 5 times.  Then I abort the
> task if it still was not successful.  With this procedure the problem
> has never shown up again.
> 
> I hope this helps you to solve your problem.  Otherwise you should
> send me your script and I will have a look at it.
> 
> Cheers,
> 
> Norbert
> --
> Ceterum censeo          | PGP encrypted mail preferred.
> Redmond esse delendam.  | PGP Key at www.MPA-Garching.MPG.de/~nog/
> 
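
The retry procedure Norbert describes in the quoted message above (open a
fresh connection, attempt the action, check the result, tear down and
sleep on failure, give up after 5 tries) could be sketched roughly as
follows. This is a language-neutral illustration in Python, not the API
of the Perl AFS module: connect and action are hypothetical callables
standing in for AFS::KAS->AuthServerConn(...) and
$kas->ChangePassword(...), and the function name and its parameters are
assumptions made up for this example.

```python
import time


def run_with_fresh_connection(connect, action, max_attempts=5,
                              delay_seconds=5, sleep=time.sleep):
    """Try action(conn) on a brand-new connection, retrying on failure.

    connect: hypothetical callable returning a new connection object
    action:  hypothetical callable taking the connection, returning
             True on success and False on failure
    Returns the number of the attempt that succeeded; raises
    RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        conn = connect()           # new connection for every attempt
        if action(conn):
            return attempt
        del conn                   # drop the handle (cf. "undef $kas")
        if attempt < max_attempts:
            sleep(delay_seconds)   # give the server time to settle
    raise RuntimeError("action failed after %d attempts" % max_attempts)
```

One deliberate difference from the description: the sketch does not sleep
after the final failed attempt, since nothing follows it. The sleep
function is passed in as a parameter only so the loop can be exercised
without actually waiting.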

-- 
********************************
David William Botsch
Consultant/Advisor II
CCMR Computing Facility
dwb7@ccmr.cornell.edu
********************************