[OpenAFS] SUN E3500 Fileserver hangs again and again ...

Erwin Broschinski broschi@id.ethz.ch
Tue, 17 Sep 2002 11:40:15 +0200 (MEST)


Hi,

We have now (almost) completely replaced IBM AFS with OpenAFS 1.2.6 on
all servers in our cell. The transition was quite smooth, except for one
fileserver: a SUN E3500 with 2 Photon drives, running SunOS 5.8 at a recent
patch level. Another SunOS 5.8 fileserver with OpenAFS 1.2.6, an E450 with an
A1000, is running without problems.


The following happened several times under 1.2.5 and 1.2.6:

Some clients started to freeze when running AFS binaries (klog, tokens).
These binaries are themselves stored in AFS. I assumed the clients had no
access to the replicated volumes. As on previous occasions, I ran bos restart
on the suspected fileserver. It then hung and would not shut down. I killed
the fileserver process with kill -SEGV, then started the fileserver again with
bos start, and it began to salvage. The frozen command lines were released
the moment the fileserver was killed.
Access to the volumes was fine after salvaging.
The hang occurs anywhere from once every 2 days to once a week.
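
For reference, the recovery sequence described above can be sketched roughly
as follows. This is only an illustration, not the exact commands that were
typed: the server name and PID passed in are placeholders, the helper names
recovery_commands and run are made up for this sketch, and authentication
options (such as -localauth) are left out. By default it only prints the
commands instead of running them.

```python
import subprocess


def recovery_commands(server, fileserver_pid):
    """Return the argv lists for the manual recovery steps, in order.

    Both arguments are placeholders supplied by the caller; nothing here
    is looked up from the real cell.
    """
    return [
        # 1. Ask the bosserver to restart the fs instance
        #    (in the report, this is the step that hung).
        ["bos", "restart", "-server", server, "-instance", "fs"],
        # 2. On the fileserver host itself: kill the stuck process with
        #    SIGSEGV so that it leaves a core file behind.
        ["kill", "-SEGV", str(fileserver_pid)],
        # 3. Start the fs instance again; salvaging runs before the
        #    volumes become available.
        ["bos", "start", "-server", server, "-instance", "fs"],
        # 4. Check how far the instance has come.
        ["bos", "status", "-server", server, "-instance", "fs", "-long"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Calling run(recovery_commands("fs.example.org", 12345)) prints the four
commands without touching any server. Steps 1 and 2 would in practice be
issued from separate shells, since the restart does not return while the
fileserver is hung.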

I have made corefile.fs and the logs available under:
/afs/ethz.ch/etc/afs-logs/nethzafs-003
in case someone wants to have a look :^)

Thanks for any suggestions

Erwin




                                                         ''`'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~O-O~~~~~~~
Erwin Broschinski               Tel:    +41 1 632 4281
Swiss Fed. Inst. of Technology  Fax:    +41 1 632 1225 
ETH Zentrum RZ/G8.1             E-Mail: broschi@id.ethz.ch
8092 Zurich                     PGP-key:  
Switzerland                     www.tik.ee.ethz.ch/~pgp/Search.html
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

"Ceterum censeo, 'Parvam Mollim' esse delendam."  (nach Cicero)