[OpenAFS] Re: deadlock in OpenAFS 1.4.11 (Solaris 5.10)

Andrew Deason adeason@sinenomine.net
Fri, 9 Apr 2010 14:45:32 -0500


On Fri, 9 Apr 2010 14:48:34 -0400
Derrick Brashear <shadow@gmail.com> wrote:

> > What about the kernel stack trace for the proc(s) from mdb? Or do you
> > know where we're hanging?
> 
> nope. i figured fstrace would make it easier to guess that but jumping
> directly to a stack trace is probably a fine course of action.

John, if you want to do this, do the following for each PID listed in
that cmdebug output:

("$pid", "ffffffffaddress1", and "ffffffffaddress2" etc are placeholders)

# mdb -k
> 0t$pid::pid2proc | ::threadlist
            ADDR             PROC              LWP CMD/LWPID
ffffffffaddress1 ffffffffaddress2                0 XXX/YYY
> ffffffffaddress1::findstack

So, as an example, looking at process 674:

> 0t674::pid2proc | ::threadlist
            ADDR             PROC              LWP CMD/LWPID
ffffffff83e74908 ffffffff832de020                0    /239
> ffffffff83e74908::findstack
[stack trace]

(make sure you don't see anything sensitive in there, though I don't
think there would be)
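
If there are a lot of PIDs, a small shell function can generate the mdb
input instead of typing each command by hand. This is only a sketch: the
name mdb_cmds_for_pids is made up here, and it uses "::walk thread" to
pass every thread of the process to ::findstack, rather than copying
addresses out of the ::threadlist output.

```shell
# Hypothetical helper (not part of OpenAFS or mdb): print one mdb
# command line per PID given as an argument.
mdb_cmds_for_pids() {
    for pid in "$@"; do
        # 0t marks the PID as decimal; ::pid2proc yields the proc_t,
        # ::walk thread visits each of its kernel threads, and
        # ::findstack prints the stack of each one.
        printf '0t%s::pid2proc | ::walk thread | ::findstack\n' "$pid"
    done
}
```

As root, something like "mdb_cmds_for_pids 674 1234 | mdb -k > stacks.txt"
should then collect all the traces in one file (1234 is just a second
made-up PID; use the ones from cmdebug).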

fstrace may give more information on how we came to that point, but the
stack traces should tell us why someone is hanging while holding the
lock we're waiting for...

-- 
Andrew Deason
adeason@sinenomine.net