[OpenAFS] Re: interpreting cmdebug output of locked entries

Andrew Deason adeason@sinenomine.net
Wed, 2 Nov 2011 15:59:19 -0500


On Wed, 2 Nov 2011 11:04:59 -0700
Jonathan Nilsson <jnilsson@uci.edu> wrote:

> Lock afs_xdcache status: (writer_waiting, write_locked(pid:3700 at:617), 1 waiters)
> ** Cache entry @ 0xea61a4c0 for 2.536870959.1.1 [ss2k.uci.edu]
>     locks: (none_waiting, 1 read_locks(pid:22600))
>             2048 bytes  DV           57  refcnt     2
>     callback 00000000 expires 1320206924
>     0 opens 0 writers
>     volume root
>     states (0x4), read-only
> 
> Unfortunately, I do not have a process list, so I don't know what pid
> 3700 is, but most likely it is httpd I assume.

pid 3700 is probably just an afsd daemon. How long does it stay like
this? at:617 looks like it's just part of the pass where a background
daemon writes dirty cache entries out to disk. That should not take
very long, and we only do it about once an hour.

If you can press alt-sysrq-t on the console, you may get kernel
backtraces for every process logged, which would be helpful. Whether or
not you actually get them depends on how wedged the machine is, of course.
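
If the console keyboard is not reachable but a root shell still
responds, the same dump can usually be requested through
/proc/sysrq-trigger on Linux. A minimal sketch, assuming a Linux client;
the function name and its path parameter are made up for illustration:

```python
def request_task_dump(trigger_path: str = "/proc/sysrq-trigger") -> None:
    """Ask the Linux kernel to log the state and backtrace of every task.

    Writing 't' here is the same request as alt-sysrq-t on the console.
    Needs root. The output goes to the kernel log (e.g. dmesg), not stdout.
    """
    with open(trigger_path, "w") as trigger:
        trigger.write("t")
```

Call request_task_dump() as root, then look at dmesg or the syslog for
the backtraces.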

> Now, trying to determine the file that this cache entry refers to,

It's fid 536870959.1.1, which is the root directory for volume
536870959. However, nobody is waiting for the lock on that cache entry,
so it's not causing the hang. (It may be what is hang_ing_, however, and
I would assume pid 22600 is an httpd process.)
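
To make the fid reading concrete, here is a rough sketch of splitting
the identifier cmdebug prints on the "Cache entry" line. It assumes the
four fields are cell.volume.vnode.unique, with the leading number being
the client's internal cell index, and that a volume's root directory is
vnode 1, unique 1. The Fid type and both helper names are hypothetical,
not part of any OpenAFS tool:

```python
from typing import NamedTuple


class Fid(NamedTuple):
    cell: int    # assumed: cell index local to this client
    volume: int
    vnode: int
    unique: int


def parse_cmdebug_fid(text: str) -> Fid:
    """Split a 'cell.volume.vnode.unique' string from a cmdebug cache entry line."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated fields, got {text!r}")
    return Fid(*(int(p) for p in parts))


def is_volume_root(fid: Fid) -> bool:
    # Assumption: the root directory of a volume is vnode 1, unique 1.
    return fid.vnode == 1 and fid.unique == 1


fid = parse_cmdebug_fid("2.536870959.1.1")
print(fid.volume, is_volume_root(fid))  # prints: 536870959 True
```

That matches the "volume root" line cmdebug printed for this entry.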

-- 
Andrew Deason
adeason@sinenomine.net