[OpenAFS] Re: Solaris 10 deadlock issue

Benjamin Kaduk kaduk@MIT.EDU
Fri, 17 Jun 2011 13:01:33 -0400 (EDT)


Jumping in rather late ...

On Wed, 15 Jun 2011, Andrew Deason wrote:

> On Tue, 14 Jun 2011 22:39:55 -0500
> Andrew Deason <adeason@sinenomine.net> wrote:
>
>> echo "::walk thread | ::findstack" | mdb -k unix.N vmcore.N > foo.out
>>
>> Then ideally edit foo.out and remove anything that doesn't mention
>> "afs" in the stack trace. But if this is as easily reproducible as it
>> looks, then we can probably get our own soon enough.
>
> I talked about this a bit at the Workshop today, but so it's here... I
> do have this replicated locally, and I sorta know what the problem is.
> Something has changed with how mmaped data is retrieved, and our local

This issue sounds rather similar (superficially, at least) to one we've 
been seeing on FreeBSD clients.  When you say that "something has changed 
...", is that something you think is OS-specific AFS code, OS code, or 
generic AFS code?

Thanks,

Ben

> bookkeeping on dcache entries is as a result preventing us from kicking
> out the dcache entries for the file you're reading when the cache gets
> too full.
>