[OpenAFS-devel] Linux 2.6.10, OpenAFS 1.3.80 hangs

Miles Davis miles@cs.stanford.edu
Wed, 6 Apr 2005 11:14:26 -0700


On Tue, Apr 05, 2005 at 03:29:16PM -0400, Chaskiel M Grundman wrote:
> I've seen something that may be similar. I was able to track it to pkmap 
> (high memory pte) exhaustion. If you can, do the following:
> 
> - make sure you are not running X
> - trigger the hang
> - hit Ctrl-ScrollLock on the console to get a stack trace of all processes.
> - look through the traces (syslogd and klogd tend to still be operating 
> properly, so this data should also be available in the logs if you reboot)
> 
> If you see lots of
> kmap_high+0x179/0x311
> copy_strings+0x115/0x1e6
> copy_strings_kernel+0x18/0x1e
> do_execve+0x10f/0x1e6
> 
> or anything else that terminates in kmap_high, then this is the problem you 
> have. I have been unable to track down any unbalanced kmap() call in afs. 
> If this were not an afs-related problem, though, I would expect to find 
> other reports on lkml or in Red Hat's bugzilla, and there were none the 
> last time I looked.

Bingo, looks like it:

[<c0143665>] kmap_high+0xc6/0x19a
[<c011a515>] default_wake_function+0x0/0xc
[<c011a515>] default_wake_function+0x0/0xc
[<c015ad04>] copy_strings+0x11a/0x1ee
[<c015adf0>] copy_strings_kernel+0x18/0x1e
[<c015bfc2>] do_execve+0x114/0x1ed
[<c0102949>] sys_execve+0x2b/0x8a
[<c0103ccb>] syscall_call+0x7/0xb
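
For anyone else who wants to check their own logs, here is a rough sketch
of how the dumped traces could be counted. It assumes frames look like the
ones above (symbol+0xoffset/0xsize, optionally preceded by an [<address>])
and that any line without a frame ends the current trace. The function
names are made up for illustration; they are not part of OpenAFS or the
kernel.

```python
import re

# A frame looks like "[<c0143665>] kmap_high+0xc6/0x19a". The address part
# is not required, so quoted frames such as "> kmap_high+0x179/0x311" match too.
FRAME_RE = re.compile(r"([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def split_traces(text):
    """Group consecutive frame lines into traces (lists of symbol names).

    Assumption: any line that does not contain a frame (a task header,
    "Call Trace:", a blank line) marks the end of the current trace.
    """
    traces, current = [], []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            current.append(match.group(1))
        elif current:
            traces.append(current)
            current = []
    if current:
        traces.append(current)
    return traces


def count_kmap_high(text, symbol="kmap_high"):
    """Return (traces containing `symbol`, total traces found)."""
    traces = split_traces(text)
    hits = sum(1 for trace in traces if symbol in trace)
    return hits, len(traces)
```

Feed it the text of whatever file your syslog writes kernel messages to
and compare the two numbers. If most of the traces contain kmap_high, you
are probably looking at the same hang.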

Thanks for the tip. If there is anything else I can do to help track this 
down, let me know.

-- 
// Miles Davis - miles@cs.stanford.edu - http://www.cs.stanford.edu/~miles
// Computer Science Department - Computer Facilities
// Stanford University