[OpenAFS] fileserver runs constantly; stops answering
emoy@apple.com
Fri, 7 Feb 2003 18:47:28 -0800
I used /usr/bin/sample:
# sample 535 3 10
This samples process 535 every 10 milliseconds for 3 seconds (300 samples
total).
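
As a rough check on that sample count, the arithmetic can be sketched in shell. The function name expected_samples is made up for illustration and is not part of sample itself; the arguments are the duration in seconds and the interval in milliseconds, in the same order as on the command line above.

```shell
# Hypothetical helper (not part of /usr/bin/sample): how many samples
# to expect for a given duration (seconds) and interval (milliseconds).
expected_samples() {
    duration_s=$1
    interval_ms=$2
    # Convert the duration to milliseconds, then divide by the interval.
    echo $(( duration_s * 1000 / interval_ms ))
}

# Same numbers as "sample 535 3 10": 3 seconds at 10 ms per sample.
expected_samples 3 10
```

This matches the 300 at the root of the call graph, so every sample landed in the one listener thread.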
------------------------------------------------------------------------
--
Edward Moy
Apple Computer, Inc.
emoy@apple.com
(This message is from me as a reader of this list, and not a statement
from Apple.)
On Friday, February 7, 2003, at 06:07 PM, Brent Johnson wrote:
> How did you create the call stack?
>
> emoy@apple.com wrote:
>
>> I've been pushing thousands of files to the file server (copying the
>> Mac OS X boot drive to AFS). All of a sudden, /afs goes away, and
>> the fileserver process runs constantly:
>>
>> 535 ?? R< 8:37.80 /usr/afs/bin/fileserver
>>
>> bos can't stop it, so I have to kill -INT it. This has happened
>> several times.
>>
>> Sampling the process, I get the following, though it complains that
>> the stack is in an inconsistent state, so it truncates the trace:
>>
>> Analysis of sampling pid 535 every 10 milliseconds
>> Call graph:
>> 300 Thread_0e03
>>   300 Create_Process_Part2
>>     300 rx_ListenerProc
>>       298 rxi_ListenerProc
>>         153 select
>>           153 select [STACK TOP]
>>         80 rxevent_RaiseEvents
>>           79 clock_UpdateTime
>>             77 getitimer
>>               77 getitimer [STACK TOP]
>>             2 clock_UpdateTime [STACK TOP]
>>           1 rxevent_RaiseEvents [STACK TOP]
>>         56 rxi_ReadPacket
>>           49 recvmsg
>>             49 recvmsg [STACK TOP]
>>           3 rxi_ReadPacket [STACK TOP]
>>           2 bzero
>>             2 bzero [STACK TOP]
>>           1 __error
>>             1 __error [STACK TOP]
>>           1 rxi_ReadPacket
>>             1 cerror
>>               1 cthread_set_errno_self
>>                 1 cthread_set_errno_self [STACK TOP]
>>         3 __eprintf
>>           3 __eprintf [STACK TOP]
>>         3 rxi_RestoreDataBufs
>>           3 rxi_RestoreDataBufs [STACK TOP]
>>         2 rxi_ListenerProc [STACK TOP]
>>         1 recvmsg
>>           1 recvmsg
>>             1 recvmsg [STACK TOP]
>>       2 rxevent_RaiseEvents
>>         2 rxevent_RaiseEvents [STACK TOP]
>>
>> Total number in stack (recursive counted multiple, when >=5):
>>
>> Sort by top of stack, same collapsed (when >= 5):
>> select [STACK TOP] 153
>> getitimer [STACK TOP] 77
>> recvmsg [STACK TOP] 50
>>
>> Anyone seen this and/or know what it is about? The server is Mac OS
>> X running OpenAFS 1.2.8, with various patches, though nothing in the
>> lwp area, where the sample is indicating.