[OpenAFS-devel] Problems adding a new server encryption key.

Renata Maria Dart <renata@SLAC.Stanford.EDU>
Tue, 16 Sep 2003 10:11:04 -0700 (PDT)


Hi, yesterday we attempted to update our AFS server encryption
keys.  We have done this procedure a dozen or so times under
Transarc AFS with minimal problems.  Yesterday was our first time 
trying it under OpenAFS and things did not proceed as expected.

All of our database servers (3) and fileservers (8) run Solaris 9 with
OpenAFS 1.2.9.  We always keep 2 server keys in our KeyFile and we 
change them every 6-8 months.  During the update process we retire
the key that was current 6 months ago.  According to the Transarc 
documentation of this procedure, the first steps, which involve 
updating the KeyFile, should not require a restart of any server 
processes.  Only when you update the afs entry in the auth database 
is a restart of the db servers needed.  Yesterday I did the following:

1.  I did a bos removekey of the lowest numbered key in the KeyFile on
    the db server which runs upserver of /usr/afs/etc, leaving only 
    one key.  The one key left in the KeyFile matched the afs entry 
    in the auth database.  I watched with bos listkeys as this change
    propagated to our other servers.  No problems yet.

2.  I generated a new random key and used bos addkey to add it to the
    KeyFile on that same server.  As soon as I did this, messages like:

Lost contact with file server 134.79.17.xx in cell slac.stanford.edu (all
 multi-homed ip addresses down for the server)
 
    began appearing in our SYSLOG output.  I watched as each of our 
    fileservers in turn stopped serving files, as each one got the new copy 
    of the KeyFile.  
    
    While clients could no longer see files in AFS, I could successfully
    talk to the db server processes with commands like vos listvldb,
    kas exam, and pts exam, and the fileservers would respond to  
    commands like vos listvol and vos partinfo.   I could also klog
    and get a new token.
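
For reference, the two KeyFile steps above can be written out as a dry-run
plan.  This is only a sketch: it builds the bos command lines in order and
does not execute them.  The host names and key version numbers are
hypothetical placeholders, and it does not show how the new key material is
supplied to bos addkey.

```python
from typing import List


def key_rotation_plan(update_server: str, old_kvno: int, new_kvno: int,
                      check_servers: List[str]) -> List[List[str]]:
    """Return the bos commands for the KeyFile steps, in order, without running them."""
    # Step 1: retire the lowest numbered key on the machine that distributes
    # /usr/afs/etc to the other servers.
    plan = [["bos", "removekey", "-server", update_server, "-kvno", str(old_kvno)]]
    # Check each server until the change has propagated.
    plan += [["bos", "listkeys", "-server", s] for s in check_servers]
    # Step 2: add the new key on the same machine (key material not shown here).
    plan.append(["bos", "addkey", "-server", update_server, "-kvno", str(new_kvno)])
    # Check propagation again after the addition.
    plan += [["bos", "listkeys", "-server", s] for s in check_servers]
    return plan


if __name__ == "__main__":
    # Hypothetical hosts and key version numbers, for illustration only.
    for argv in key_rotation_plan("db1.example.org", 3, 5,
                                  ["fs1.example.org", "fs2.example.org"]):
        print(" ".join(argv))
```

Printing the plan before running anything makes it easy to review the order
of operations and the key version numbers with a colleague first.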
    
At this point we spent some time speculating about what had happened and 
how to fix it.  We were concerned that the new key might have corrupted
the KeyFile, despite the fact that bos listkeys always produced normal
output that matched on all of the servers.  So we backed out the new
key, using bos removekey, leaving only the one "current" key which 
matched the afs entry in the auth database.  That didn't help.  What 
finally fixed the situation was to restart all of the db servers, 
and then restart all of the fileservers.
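
The cross-server comparison of bos listkeys output described above can be
sketched as follows.  The sketch assumes the output contains lines of the
form "key <kvno> has cksum <number>"; the sample text, server names, and
function names are hypothetical.  It only compares what bos listkeys
reports, so matching output says nothing about KeyFile contents beyond the
reported checksums.

```python
import re
from typing import Dict, List

# Assumed line format: "key 3 has cksum 972037177"
KEY_LINE = re.compile(r"^key\s+(\d+)\s+has cksum\s+(\d+)\s*$")


def parse_listkeys(output: str) -> Dict[int, int]:
    """Map key version number -> checksum from captured bos listkeys output."""
    keys: Dict[int, int] = {}
    for line in output.splitlines():
        match = KEY_LINE.match(line.strip())
        if match:
            keys[int(match.group(1))] = int(match.group(2))
    return keys


def servers_disagreeing(outputs: Dict[str, str], reference: str) -> List[str]:
    """Return the servers whose reported keys differ from the reference server's."""
    expected = parse_listkeys(outputs[reference])
    return sorted(name for name, text in outputs.items()
                  if parse_listkeys(text) != expected)


if __name__ == "__main__":
    # Hypothetical captured output; in practice this would come from
    # running bos listkeys against each server.
    sample = {
        "db1": "key 4 has cksum 972037177\nkey 5 has cksum 2825165022\nAll done.\n",
        "fs1": "key 4 has cksum 972037177\nkey 5 has cksum 2825165022\nAll done.\n",
        "fs2": "key 4 has cksum 972037177\nAll done.\n",
    }
    print(servers_disagreeing(sample, "db1"))
```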

My questions are:

1.  Is the Transarc procedure for updating server keys supposed to
    work under OpenAFS?  Or is a restart of the db and fileservers
    now needed after a new key is added to the KeyFile?   After the 
    incident described above we went through the archives and found 
    mail from Derrick Brashear in response to Frederick Gilbert:
    
    http://www.mail-archive.com/openafs-info@openafs.org/msg07515.html
    
    in which a "stuck fileserver" situation is described, but in that
    case it was after the bos addkey AND kas setpasswd had both been
    done.  In our case, I never got to the kas setpasswd step.
    
    And Nathan Neulinger asked some questions about this process in:
    
    https://lists.openafs.org/pipermail/openafs-devel/2001-January/000480.html

    I couldn't find a response to Nathan and I couldn't find an open
    bug that might be related to Frederick Gilbert's mail.

2.  If a restart is now necessary, is there some subset of processes
    that would suffice, rather than restarting all of our servers with
    bos restart -bosserver, which is what I did?

3.  If we now need to restart the servers after a bos addkey, can you 
    tell us why?  
    
4.  Could the KeyFile have been corrupted and still present a normal
    response with bos listkeys?  


Thanks for your help,

-Renata


 Renata Dart                         | renata@SLAC.Stanford.edu  
 Stanford Linear Accelerator Center  |    
 2575 Sand Hill Road, MS 97          | (650) 926-2848 (office)
 Stanford, California   94025        | (650) 926-3329 (fax)