[OpenAFS] Re: Debugging opportunity (time-sensitive)

Andrew Deason adeason@sinenomine.net
Wed, 18 May 2011 15:42:14 -0500


On Wed, 18 May 2011 15:59:33 -0400
Jeff Blaine <jblaine@kickflop.net> wrote:

> >>     0        ->  afs_osi_Sleep
> >>     0         | afs_osi_Sleep:entry             event 705ac1bc = 1023, 1,
> >> 1, 1, 0, 0, 0, 2062683024, 2062683824, 0, 2062684288
> >
> > This is looking a little weird, but I'm not really used to looking at a
> > lock structure like this. Are you running a 32-bit kernel module?
> 
> bash-3.00# file /kernel/fs/sparcv9/afs
> /kernel/fs/sparcv9/afs: ELF 64-bit MSB relocatable SPARCV9 Version 1
> bash-3.00#

That only describes the file on disk; it doesn't say which module is
currently loaded.

> > If you run that again, do these values change?
> 
> I ran it once just after receiving this email, and yes,
> it did "more stuff" then hung with a similar line.

Similar? Or the same?

> Now when I run it over and over, the trace shows the same
> ~25 lines as reported above, and hangs there as well.
> The values shown for afs_osi_Sleep:entry do not change.

Everything is _exactly_ the same? And these processes hang forever
without exiting?

One of the values there is supposed to record the number of threads
waiting for that lock, so it should change with each new waiter. If it
doesn't, then either the addresses we're examining are the wrong
addresses, or we're not sleeping due to a lock acquisition like I
thought.
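
To illustrate why identical values look suspicious, here is a toy sketch
in C. The struct and the function names (toy_lock, toy_begin_wait,
toy_end_wait) are made up for this example; this is not the real OpenAFS
lock layout or sleep path, just the general idea of a lock that counts
its waiters.

```c
/* Hypothetical toy lock for illustration only; NOT the actual OpenAFS
 * lock structure or field layout. */
struct toy_lock {
    unsigned short num_waiting;  /* threads currently asleep on this lock */
    unsigned char  excl_locked;  /* nonzero while a writer holds the lock */
};

/* Register one more waiter and return the count that a tracer would
 * observe at sleep entry for this particular waiter. */
static unsigned short toy_begin_wait(struct toy_lock *lk)
{
    lk->num_waiting++;
    return lk->num_waiting;
}

/* Undo toy_begin_wait() once the waiter has been woken up. */
static void toy_end_wait(struct toy_lock *lk)
{
    if (lk->num_waiting > 0)
        lk->num_waiting--;
}
```

Under these assumptions, if three processes block on the same held lock
one after another and none of them ever wakes up, the snapshots taken at
sleep entry would differ for each waiter instead of repeating the same
numbers.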

I don't know if anyone else has an idea on that.

-- 
Andrew Deason
adeason@sinenomine.net