[OpenAFS-devel] Re: [OpenAFS] Kernel panic, 1.4.2, Solaris 9

Derrick J Brashear shadow@dementia.org
Thu, 26 Apr 2007 17:17:52 -0400 (EDT)


On Thu, 26 Apr 2007, Kevin Hildebrand wrote:

>
> Hello, we just had one of our fileservers go into meltdown mode, and stop 
> serving data for a while.  During this time, several of our client machines 
> kernel panicked repeatedly with the messages below.  The machines
> continued the reboot/panic cycle until we got the fileserver back under 
> control.
>
> We have occasional meltdowns due to load or other unknown factors, but this 
> is the first time I've seen widespread client panics as a result.
>
> I have numerous crash dumps available if anyone wants to look at them or 
> wants me to provide additional data.
>
> Thanks,
>
> Kevin
>
> Apr 26 13:07:36 po0.wam.umd.edu afs: [ID 998965 kern.notice] getvolslot none
> Apr 26 13:07:36 po0.wam.umd.edu unix: [ID 836849 kern.notice]
> Apr 26 13:07:36 po0.wam.umd.edu ^Mpanic[cpu1]/thread=30004050800:
> Apr 26 13:07:36 po0.wam.umd.edu unix: [ID 998965 kern.notice] getvolslot none

This suggests you were getting "waiting for busy volume" on every volume, 
tying up every slot in the volumes list.

Perhaps we could make this sleep (or otherwise wait) instead of panicking.
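
To make the "sleep instead of panic" idea concrete, here is a minimal
user-space sketch. It is not the OpenAFS code: the table size, the struct,
and every function name below are hypothetical, and the caller-supplied
wait callback only stands in for whatever sleep/wakeup primitive the
kernel would really use. The allocator retries after each wait and
returns an error once a retry budget is exhausted, rather than aborting
when every slot is tied up.

```c
#include <stddef.h>

#define NSLOTS 4                /* hypothetical table size */

struct slot_table {
    int busy[NSLOTS];           /* 1 = slot tied up, 0 = free */
};

/* Claim the first free slot; return its index, or -1 if all are busy. */
int try_get_slot(struct slot_table *t)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (!t->busy[i]) {
            t->busy[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Mark a slot free again; out-of-range indexes are ignored. */
void release_slot(struct slot_table *t, int i)
{
    if (i >= 0 && i < NSLOTS)
        t->busy[i] = 0;
}

/* Hypothetical hook: blocks the caller until a slot may have freed up. */
typedef void (*wait_fn)(struct slot_table *t, void *arg);

/*
 * Try to claim a slot. If none is free, call wait() and retry, at most
 * max_waits times. Returns the slot index, or -1 if the table is still
 * full after the last wait, so the caller gets an error, not a panic.
 */
int get_slot_or_wait(struct slot_table *t, int max_waits,
                     wait_fn wait, void *arg)
{
    for (int attempt = 0; ; attempt++) {
        int i = try_get_slot(t);
        if (i >= 0)
            return i;
        if (attempt >= max_waits)
            return -1;
        wait(t, arg);
    }
}

/* Demo waiter: pretends another thread released the slot numbered *arg. */
void demo_wait_release(struct slot_table *t, void *arg)
{
    release_slot(t, *(int *)arg);
}
```

With every slot busy, get_slot_or_wait(&t, 3, demo_wait_release, &n)
waits once and then hands back slot n, while a max_waits of 0 returns -1
immediately. Whether the real code should bound the wait at all, or sleep
until woken, is the open question.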

Moving to the devel list.

Thoughts?