[OpenAFS] openafs 1.4: kaserver crashes every 5 minutes on AIX 5.2

Ernst Jeschek jeschek@wu-wien.ac.at
Thu, 27 Oct 2005 12:33:36 +0200


On Thu, Oct 27, 2005 at 11:44:25AM +0200, Horst Birthelmer wrote:
> I'm running db servers on AIX 5.2, too, and they're working. I can't  
> think of why a FiveMinuteCheckLWP would cause a crash.
> What I'm trying to say, is, that's pretty weird, but that's what all  
> bugs are :-)

That was my thought, too, when I first discovered this :-)

> You're sure it works until the first 5 min. check?
> Does it say anything in the logs during startup?
> What does udebug say about the quorum during those 5 minutes?

Yes, it works. Even the crashes don't seem to do any harm. They
aren't even noticed by the clients.

The output of udebug and the logs both look normal:

| root> bos getlog goya AuthLog
| Fetching log file 'AuthLog'...
| kerberos-iv/udp port=750
| kerberos5/udp is unknown; check /etc/services.  Using port=88 as default
| 005 Using level crypt for Ubik connections.
| Thu Oct 27 12:12:32 2005 Using 137.208.3.33 as my primary address
| Thu Oct 27 12:12:33 2005 Starting to process AuthServer requests
| Starting to listen for UDP packets
| start 5 min check lwp
| 
| root> udebug goya k
| Host's addresses are: 137.208.3.33 
| Host's 137.208.3.33 time is Thu Oct 27 12:08:00 2005
| Local time is Thu Oct 27 12:08:00 2005 (time differential 0 secs)
| Last yes vote for 137.208.3.33 was 8 secs ago (sync site); 
| Last vote started 8 secs ago (at Thu Oct 27 12:07:52 2005)
| Local db version is 1130395706.55
| I am sync site until 49 secs from now (at Thu Oct 27 12:08:49 2005) (8 servers)
| Recovery state 1f
| Sync site's db version is 1130395706.55
| 0 locked pages, 0 of them for write
| 
| [...]
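
To watch these values across the 5-minute interval without reading
the output by eye, something like the following rough sketch could be
used. It only parses text in the format udebug printed above. The
names parse_udebug and looks_healthy are made up for illustration,
and treating recovery state 1f as "fully recovered" is my assumption,
not something the output itself states.

```python
import re

# Sample taken from the udebug output quoted above (without the "| " prefix).
SAMPLE = """\
Host's addresses are: 137.208.3.33
Host's 137.208.3.33 time is Thu Oct 27 12:08:00 2005
Local time is Thu Oct 27 12:08:00 2005 (time differential 0 secs)
Last yes vote for 137.208.3.33 was 8 secs ago (sync site);
Last vote started 8 secs ago (at Thu Oct 27 12:07:52 2005)
Local db version is 1130395706.55
I am sync site until 49 secs from now (at Thu Oct 27 12:08:49 2005) (8 servers)
Recovery state 1f
Sync site's db version is 1130395706.55
0 locked pages, 0 of them for write
"""


def parse_udebug(text):
    """Pull a few fields out of udebug-style text (hypothetical helper)."""
    def find(pattern):
        return re.search(pattern, text, re.MULTILINE)

    sync = find(r"^I am sync site until (\d+) secs from now.*\((\d+) servers\)")
    state = find(r"^Recovery state ([0-9a-fA-F]+)")
    local = find(r"^Local db version is (\S+)")
    remote = find(r"^Sync site's db version is (\S+)")
    return {
        "is_sync_site": sync is not None,
        "sync_secs_left": int(sync.group(1)) if sync else None,
        "servers": int(sync.group(2)) if sync else None,
        # The state is printed in hex; keep it as an integer bit mask.
        "recovery_state": int(state.group(1), 16) if state else None,
        "local_version": local.group(1) if local else None,
        "sync_version": remote.group(1) if remote else None,
    }


def looks_healthy(info):
    """Assumption: 0x1f means all recovery flags are set on the sync site."""
    return (
        info["is_sync_site"]
        and info["recovery_state"] == 0x1F
        and info["local_version"] is not None
        and info["local_version"] == info["sync_version"]
    )


if __name__ == "__main__":
    print(looks_healthy(parse_udebug(SAMPLE)))
```

Feeding it the output saved just before and just after a crash would
show whether the sync site or the db version changes at that moment.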

I thought that maybe some compiler option (optimization?) is causing
this. But why only on the sync site? (This is the machine on which
the whole thing was built.) I will try to recompile tomorrow.

Maybe it's a problem with the AIX installation on this machine, but
then why do the other server processes (and the whole machine) work
flawlessly?

regards,
ernst jeschek

-- 
Ernst.Jeschek@wu-wien.ac.at                   Fax: +43/1/31336/904105
Zentrum fuer Informatikdienste, Wirtschaftsuniversitaet Wien, Austria