[OpenAFS] Re: OpenAFS client cache overrun?

Andrew Deason adeason@sinenomine.net
Fri, 14 Mar 2014 15:43:55 -0500


On Fri, 14 Mar 2014 16:35:44 -0400
Eric Chris Garrison <ecgarris@iu.edu> wrote:

> >>[root@rgwb1 ~]# !ps
> >>ps -ef | grep 29278
> >>root     29278  4477  0 09:27 ?        00:00:00 smbd
> >>root     30101 29337  0 09:37 pts/3    00:00:00 grep 29278
> >>When I ran "top" I saw that the afs_cachetrim process was #1, but
> >>presumably wedged.
> >>I goosed /proc/sysrq-trigger and as promised, it dumped a lot of call
> >>trace info to the syslog. I'm looking through it, but am not sure what to
> >>look for. Nothing stands out, anyway.
> >
> >You're looking for the stack trace for the afs_cachetrim process. Look
> >in syslog for "afs_cachetrim", or its pid. Under that should be a trace
> >of functions that indicates where we are in the code at that time.
> >
> >I would extract that, and the entry for a hanging process. So, maybe
> >29278, or if anything hangs when touching anything in /afs, you could
> >get the entry for that.
> 
> Oddly, there's nothing for afs_cachetrim.

...but are you seeing kernel stack traces for other processes? It's not
completely clear to me from the output you have posted. Depending on your
syslog configuration, the traces may not be in the log files you're used
to looking at. 'dmesg' would definitely have them, but it will probably
be missing some content, since the kernel log buffer probably won't have
space for everything.

You definitely have an afs_cachetrim process, according to what you said
earlier (you saw it in 'top'). Not having one would definitely explain a
lockup, but it doesn't seem to be "missing"...

> >Or if you want to try to find "everything", just look for anything
> >containing the string "afs".
> 
> I get just this kind of message during the last lockup:

I meant any stack traces containing 'afs'. Surely you have some, if the
afs client is running at all.
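
If digging through the dump by hand is painful, something along these
lines might help. It is only a rough sketch: it assumes each task's entry
in the SysRq-T output starts with an unindented header line whose second
field is a one-letter task state, and that everything up to the next such
header belongs to that task. The exact layout varies between kernel
versions, so the TASK_HEADER pattern and the function names here are my
own assumptions and may need adjusting.

```python
import re

# Assumed header shape: "<comm> <state> <addr> <free> <pid> <ppid> <flags>".
# Only the first two fields are checked; adjust this for your kernel.
TASK_HEADER = re.compile(r"^(?P<name>\S+)\s+(?P<state>[A-Za-z])\s")

# Optional "[  123.456789] " prefix that dmesg puts on each line.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def split_task_blocks(lines):
    """Group lines into one block per task, starting at each header line."""
    blocks = []
    current = None
    for raw in lines:
        line = TIMESTAMP.sub("", raw.rstrip("\n"))
        if TASK_HEADER.match(line):
            current = [line]
            blocks.append(current)
        elif current is not None:
            # Stack data or trace line belonging to the current task.
            current.append(line)
    return blocks


def blocks_mentioning(lines, needle="afs"):
    """Return the task blocks in which any line contains `needle`."""
    needle = needle.lower()
    return [
        block
        for block in split_task_blocks(lines)
        if any(needle in line.lower() for line in block)
    ]
```

Feed it the lines of the saved dmesg or syslog output, then print each
returned block. Passing a pid as the needle, e.g. "29278", pulls out the
entry for a specific hanging process instead.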

-- 
Andrew Deason
adeason@sinenomine.net