[OpenAFS] Errors: Fileserver freezes, Volumes contains orphans

Rubino Geiß kb44@rz.uni-karlsruhe.de
Thu, 30 Jan 2003 21:55:01 +0100


From: Derrick J Brashear
> On Thu, 30 Jan 2003, Rubino Geiß wrote:
> 
> > This morning one of our fileservers (OpenAFS 1.2.8, rh8.0) stopped
> > serving files. Doing bos status, ping, rxdebug and looking at the log
> > files, almost everything seemed to be ok. Only the BosLog showed
> > constantly restarting file / salv processes.
> 
> I'll guess that this also was "main thread of fileserver died"

One more thing I remember: Doing a "ps -welf" shows that the fileserver
processes are waiting on "schedu" or "rt_sig", but this never changes, as it
does on fully operational servers.
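
To make the "never changes" observation repeatable, the state and wait channel
of a process can be sampled a few times in a row. The following is only a
minimal sketch, not an OpenAFS tool: it assumes a Linux /proc that exposes
/proc/<pid>/stat and /proc/<pid>/wchan (older kernels may only offer the
WCHAN column of ps), and the helper names are hypothetical. An idle but
healthy process can also show the same wait channel for a while, so the
result is a hint, not a diagnosis.

```python
import time
from pathlib import Path


def parse_stat_state(stat_line: str) -> str:
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so cut at the LAST closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_sample(pid: int) -> tuple:
    """Read (state, wait channel) for one process from /proc."""
    base = Path("/proc") / str(pid)
    state = parse_stat_state((base / "stat").read_text())
    try:
        wchan = (base / "wchan").read_text().strip()
    except OSError:
        # Assumption: the file may be missing or unreadable on some kernels.
        wchan = "?"
    return state, wchan


def looks_stuck(samples: list) -> bool:
    """True if all samples show the same non-running state and wait channel."""
    if len(samples) < 2:
        return False
    first = samples[0]
    return first[0] != "R" and all(sample == first for sample in samples)


def watch(pid: int, count: int = 5, interval: float = 1.0) -> bool:
    """Sample a process `count` times and report whether it never moved."""
    samples = []
    for _ in range(count):
        samples.append(read_sample(pid))
        time.sleep(interval)
    return looks_stuck(samples)
```

Calling watch() with the pid of each fileserver process on both the broken
and a working server would give a like-for-like comparison.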

Note also: "bos shutdown" killed only some of the fileserver processes, plus
all other AFS server processes except the bosserver.

> 
> > Can anybody tell us how to get rid of these nasty features ;)
> 
> Not unless you can actually get us a core or something else 
> to work from.

But how can I get it? Is it really a memory problem? The Linux box itself
seemed to be fine. There were no kernel messages regarding our problem.

Even "kill -KILL" doesn't work, so "kill -SEGV" doesn't trigger the usual
core file production either.
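
One way to see whether a signal such as SEGV was delivered at all is to look
at the signal masks in /proc/<pid>/status. The sketch below assumes the
Linux status format with hexadecimal SigPnd, SigBlk, SigIgn and SigCgt lines
(ShdPnd exists only on newer kernels); the function names are made up for
illustration. SIGKILL cannot be blocked, so a process that ignores
"kill -KILL" is usually sitting in the kernel in uninterruptible sleep,
which this check would not show.

```python
import signal


def parse_signal_masks(status_text: str) -> dict:
    """Extract the hexadecimal signal masks from /proc/<pid>/status text."""
    wanted = {"SigPnd", "ShdPnd", "SigBlk", "SigIgn", "SigCgt"}
    masks = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            masks[key] = int(value.strip(), 16)
    return masks


def signals_in_mask(mask: int) -> list:
    """Return the signal numbers set in a mask; bit 0 stands for signal 1."""
    return [bit + 1 for bit in range(64) if mask & (1 << bit)]


def is_pending_but_blocked(masks: dict, signum: int) -> bool:
    """True if the signal is queued for the process but currently blocked."""
    bit = 1 << (signum - 1)
    pending = masks.get("SigPnd", 0) | masks.get("ShdPnd", 0)
    return bool(pending & bit) and bool(masks.get("SigBlk", 0) & bit)


def signal_name(signum: int) -> str:
    """Map a signal number to its name, falling back to the bare number."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        # Most real-time signals have no symbolic name in Python.
        return str(signum)
```

Feeding the contents of /proc/<pid>/status for a hung fileserver into
parse_signal_masks() right after the "kill -SEGV" would show whether the
signal is still pending.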

Bye, Ruby