[OpenAFS] Problems adding a new server encryption key.

Derrick J Brashear shadow@dementia.org
Tue, 16 Sep 2003 14:05:00 -0400 (EDT)

I'll pick one place to reply, and punt the other. No code to be developed.

On Tue, 16 Sep 2003, Renata Maria Dart wrote:

> Lost contact with file server 134.79.17.xx in cell slac.stanford.edu (all
>  multi-homed ip addresses down for the server)
>     began appearing in our SYSLOG output.  I watched as each of our
>     fileservers in turn stopped serving files, as each one got the new copy
>     of the KeyFile.


> My questions are:
> 1.  Is the Transarc procedure for updating server keys supposed to
>     work under OpenAFS?  Or is a restart of the db and fileservers
>     now needed after a new key is added to the KeyFile?   After the
>     incident described above we went through the archives and found
>     mail from Derrick Brashear in response to Frederick Gilbert:
>     http://www.mail-archive.com/openafs-info@openafs.org/msg07515.html
>     in which a "stuck fileserver" situation is described, but in that
>     case it was after the bos addkey AND kas setpasswd had both been
>     done.  In our case, I never got to the kas setpasswd step.

The kas setpasswd is not relevant. The bos addkey was: it triggered a bug
I introduced while fixing another bug. I know it's fixed in 1.2.10.
Frederic Gilbert at some point said something about still having a problem
afterward, but I'm not sure what it is.

However, the fix to this issue was a patch to src/auth (probably
src/auth/cellconfig.c) which went in recently, probably 1.2.10.

> 3.  If we now need to restart the servers after a bos addkey, can you
>     tell us why?

As above, bug. An error caused an exit without dropping a mutex.

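To illustrate that failure mode (an error path that leaves without
dropping a mutex), here is a minimal sketch in C with pthreads. Every name
in it (conf_lock, parse_keys, reload_keys_buggy, reload_keys_fixed,
lock_is_free) is hypothetical and made up for the example; this is not the
OpenAFS source, only the general pattern.

```c
#include <pthread.h>

/* Hypothetical lock guarding some shared configuration state. */
static pthread_mutex_t conf_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for reading a key file; returns -1 to simulate a failure. */
static int parse_keys(int simulate_error)
{
    return simulate_error ? -1 : 0;
}

/* Buggy pattern: the error path returns while conf_lock is still held. */
int reload_keys_buggy(int simulate_error)
{
    pthread_mutex_lock(&conf_lock);
    if (parse_keys(simulate_error) != 0)
        return -1;                      /* BUG: lock is never released */
    pthread_mutex_unlock(&conf_lock);
    return 0;
}

/* Fixed pattern: success and failure both pass through the unlock. */
int reload_keys_fixed(int simulate_error)
{
    int code;

    pthread_mutex_lock(&conf_lock);
    code = parse_keys(simulate_error);
    pthread_mutex_unlock(&conf_lock);
    return code;
}

/* Returns 1 if conf_lock can be taken without blocking, 0 otherwise. */
int lock_is_free(void)
{
    if (pthread_mutex_trylock(&conf_lock) != 0)
        return 0;
    pthread_mutex_unlock(&conf_lock);
    return 1;
}
```

After a failing call to reload_keys_fixed() the lock is free again. After a
failing call to reload_keys_buggy() it stays held, so any later caller that
does a plain pthread_mutex_lock() on it would block, which would look like a
server that has stopped answering.
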
> 4.  Could the KeyFile have been corrupted and still present a normal

It wasn't corrupted.