[OpenAFS] sluggish AFS fileserver and frozen linux clients

Dave Lewis lewis@nki.rfmh.org
Fri, 11 Jan 2002 16:55:48 -0500


Hi,

One of our IBM fileservers (running AFS 3.6) became extremely sluggish
yesterday. Our Linux clients also froze, even though we have two other
fileservers. We don't know what caused the problem; the AFS logs didn't
show anything unusual.

Our setup:
Servers: AFS 3.6 fileserver on two IBM RS/6000s (AIX 4.3.3) and OpenAFS
1.1.1 fileserver on a Linux box (RH 7.1, 2.4.3-12).
Clients: AFS 3.6 on a Sun (Solaris 5.6) and several Linux boxes (RH 6.2,
2.2.14-5.0), and OpenAFS 1.1.1 on three Linux boxes (RH 7.1 and 7.2,
2.4.3-12).

We could not even log in to the Linux clients as root on a virtual console.
We also could not get back from a screen saver that was running on one of
the Linux systems. However, we could log in to the Sun client as root.

After the fileserver became sluggish, we tried to log in at the console
of the IBM server, but the login process seemed to hang. We were able
to log in to the second IBM fileserver. We ran 'bos shutdown <server_1>',
and 'bos status <server_1>' showed that everything had shut down except
the fileserver, even an hour after we issued the shutdown command.

We rebooted some of the Linux AFS 3.6 clients, and the problematic
fileserver finally shut down, though that may have been a coincidence.
After that, everything worked normally, and the login process on the
server finally finished. We salvaged all the AFS volumes on the server,
but we didn't see any significant errors or clues as to what caused the
problem.

The only major change we made recently to the affected IBM server was
replacing one of the disks in an external RAID array.

Does anyone know what may have caused this, or what to check if it happens
again?

Thanks,
Dave Lewis