[OpenAFS] mysterious afs fileserver issue

Nicholas Basila nbasila@bottlecapnotes.com
Wed, 24 Oct 2001 16:18:03 -0400


Hi,

    Today we had a serious problem with our AFS cell. We're running
OpenAFS 1.1.1a on three Sun E220s (all running Solaris 7, 64-bit). We
have several SPARC boxes (all Ultra 10) running the AFS client (1.1.1a).
For the last couple of days, they've been experiencing high loads, with
init taking more processor time than normal. We were in the process
of tracking down the problem when suddenly, one of our three AFS servers
(the control server, actually) had a fileserver process using about 97%
of the CPU, and the load jumped rather high. We don't have many users on
that server, maybe 20. I tried to restart the bos server and all the
other servers on it, but the restart hung while trying to stop them. I
then tried a shutdown -i6, but that also hung (trying to stop AFS, I would
imagine). I ended up dropping it to the PROM from the serial console,
then synced and rebooted. The server is fine now (after it ran a salvage
operation on its AFS partition), and the clients don't seem to be
experiencing quite the load they were before. I didn't see anything
noticeable in any of the AFS or system logs. Has anyone experienced
anything like this?

Thanks,

Nicholas