[OpenAFS-devel] Kernel 2.4.0 Oops when afs is shutting down

Daniel Jacobowitz dmj+afs@andrew.cmu.edu
Mon, 15 Jan 2001 13:52:10 -0500


On Mon, Jan 15, 2001 at 04:43:09PM +0100, Michael Pronath wrote:
>  After applying the patch posted recently by Chas Williams (9 Jan 2001), I
>  am able to compile and run the AFS client on my 2.4.0 kernel. However,
>  when AFS is unmounted, it oopses.  I thought, it could be the new
>  waitqueues and therefore, I added some afs_warn's to show where it happens
>  and #define'd WAITQUEUE_DEBUG 1  in include/linux/wait.h.
> 
>  AFAIK, the oops happens during the call to interruptible_sleep_on in
>  afs_osi_Sleep.  But as it happens in the kernel socket code, it may
>  well be a problem with releasing the RxListener socket in osi_StopListener.

It's the latter, actually.  Manually scheduling a timeout there usually
allows a single unmount to succeed, although a second one often oopses
anyway.

You're running SMP, right?

What happens, as best I can tell, is that we signal the listener, which
is sitting in wait_for_packet, and then we close the socket while still
in wait_for_packet, which oopses because we cleared its locked
structure out from under it.  I couldn't figure out the appropriate
lock we should be holding; we may need to invent a new one.
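
To make the ordering problem concrete, here is a rough user-space analogy
in C with pthreads.  It is not kernel code and not the OpenAFS
implementation; struct listener, start_listener and stop_listener are
hypothetical names, and a socketpair stands in for the RxListener socket.
The only point is the order in the stop path: signal the listener, wait
until it has really left its blocking receive, and only then release the
socket.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the listener state; not an OpenAFS structure. */
struct listener {
    int sock[2];            /* sock[0]: read by the listener, sock[1]: peer */
    pthread_t thread;
    pthread_mutex_t lock;   /* protects 'stopping' */
    bool stopping;          /* set by stop_listener() before the wake-up */
    bool exited;            /* set by the listener once it has left recv() */
};

static void *listener_main(void *arg)
{
    struct listener *l = arg;
    char buf[64];

    /* Analogue of sitting in wait_for_packet: block until data arrives. */
    while (recv(l->sock[0], buf, sizeof buf, 0) > 0) {
        pthread_mutex_lock(&l->lock);
        bool stop = l->stopping;
        pthread_mutex_unlock(&l->lock);
        if (stop)
            break;          /* the packet was only a wake-up */
        /* ... a real listener would process the packet here ... */
    }
    l->exited = true;       /* published to the stopper by pthread_join() */
    return NULL;
}

int start_listener(struct listener *l)
{
    l->stopping = false;
    l->exited = false;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, l->sock) != 0)
        return -1;
    pthread_mutex_init(&l->lock, NULL);
    if (pthread_create(&l->thread, NULL, listener_main, l) != 0) {
        close(l->sock[0]);
        close(l->sock[1]);
        pthread_mutex_destroy(&l->lock);
        return -1;
    }
    return 0;
}

int stop_listener(struct listener *l)
{
    /* 1. Signal the listener: set the flag, then wake it with a packet. */
    pthread_mutex_lock(&l->lock);
    l->stopping = true;
    pthread_mutex_unlock(&l->lock);
    if (send(l->sock[1], "q", 1, 0) != 1)
        return -1;

    /* 2. Wait until the listener has actually left its receive loop. */
    if (pthread_join(l->thread, NULL) != 0)
        return -1;
    assert(l->exited);

    /* 3. Only now is it safe to release the socket. */
    close(l->sock[0]);
    close(l->sock[1]);
    pthread_mutex_destroy(&l->lock);
    return 0;
}
```

Closing the socket between steps 1 and 2 is the user-space equivalent of
what I think we are doing now.  In the kernel the "wait" in step 2 would
have to be some lock or completion flag the listener drops on its way out
of wait_for_packet, which is the piece I could not identify.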

Dan

/--------------------------------\  /--------------------------------\
|       Daniel Jacobowitz        |__|        SCS Class of 2002       |
|   Debian GNU/Linux Developer    __    Carnegie Mellon University   |
|         dan@debian.org         |  |       dmj+@andrew.cmu.edu      |
\--------------------------------/  \--------------------------------/