[OpenAFS] Re: Bug in OpenAFS 1.4.12?

Andrew Deason adeason@sinenomine.net
Wed, 27 Oct 2010 12:33:55 -0500


On Tue, 12 Oct 2010 10:00:12 +0200
Claudio Prono <claudio.prono@atpss.net> wrote:

> Oct 12 05:07:46 kerberos kernel: [1537890.352252] EIP is at
> afs_CacheTruncateDaemon+0x366/0x4c0 [libafs]
[...] 
> And then, nothing more until the system was rebooted manually....
> 
> What can be the problem? AFS bug?

I've been meaning to look at this... yeah, this is probably us. A wild
guess would be that the dcache discard list is corrupted, so the code
near the top of the afs_CacheTruncateDaemon loop that clears out the
discard list never finishes and we spin there endlessly.

If you still happen to have the exact same kernel and OpenAFS module, it
would help to get information about where we are in
afs_CacheTruncateDaemon. At the very least disassembling the function
would help.
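
For lining the oops up against the disassembly: the trace reports the
location as symbol+offset/size, so the EIP above is 0x366 bytes into a
function that is 0x4c0 bytes long. A minimal Python sketch for pulling
those numbers out of a log line, assuming that format;
parse_oops_location is just an illustrative helper name, not part of any
OpenAFS or kernel tooling.

```python
import re

# Matches the "symbol+0xOFFSET/0xSIZE" form seen in the oops line above.
_LOCATION = re.compile(
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_oops_location(text):
    """Return (symbol, offset, size) for the first match in text, or None."""
    m = _LOCATION.search(text)
    if m is None:
        return None
    return m.group("symbol"), int(m.group("offset"), 16), int(m.group("size"), 16)


if __name__ == "__main__":
    line = "EIP is at afs_CacheTruncateDaemon+0x366/0x4c0 [libafs]"
    symbol, offset, size = parse_oops_location(line)
    # The offset is the distance in bytes from the start of the function.
    print(f"{symbol}: byte {offset} (0x{offset:x}) of {size} (0x{size:x})")
```

The offset is what to look for in the disassembly listing; the
instruction at or just before that offset is where the CPU was when the
trace was taken.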

I haven't done this on OpenSuSE before, but I think the tools are
similar to those on RHEL. If you have crashtool and the debuginfo package for
your running kernel installed, you should be able to just run 'crash'
and then 'dis afs_CacheTruncateDaemon' to get the asm.

-- 
Andrew Deason
adeason@sinenomine.net