[OpenAFS] Re: Strange caching failures

Andrew Deason adeason@sinenomine.net
Fri, 8 Mar 2013 09:53:37 -0600

On Fri, 8 Mar 2013 11:05:44 +0100
Stephan Wonczak <wonczak@uni-koeln.de> wrote:

>    My colleagues did exactly that - thanks for the tip!
>    The generated dump is available for download at
> http://www.uni-koeln.de/~a0033/afs_fstrace.tgz

Well, this only shows about 6 seconds of activity around 09:12. We
probably need to capture information at the time the "missed change"
occurred. From the given logs, I assume that Zeitstempel ("timestamp")
is the time the file was updated, and that each Ausgabe ("output") line
is the mtime of the file as seen on one of the other clients. If so, the
capture time is not close to the time the file changed, so this trace is
probably not very useful.

So, in order to capture that, you either need to run the 'fstrace dump'
within a few seconds of the missed update, or run the "continuous
capture" alternative I listed earlier.
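
One way to hit that window is to have a small script on the affected
client compare the mtime it sees against the expected update time and
start the dump itself. The following is only a rough sketch in Python,
not anything from the OpenAFS tree: the function names, the 1-second
tolerance and the output path are made up for illustration, and the
exact 'fstrace dump' arguments should be checked against your local
fstrace documentation.

```python
import os
import subprocess

def is_missed_update(expected_mtime, observed_mtime, tolerance=1.0):
    """True if the client still sees an mtime older than the expected one."""
    return observed_mtime < expected_mtime - tolerance

def dump_if_missed(path, expected_mtime, dump_cmd, run=subprocess.call):
    """Stat 'path'; if the update looks missed, run dump_cmd right away.

    dump_cmd is an argument list supplied by the caller, for example
    ["fstrace", "dump", "-file", "/tmp/fstrace.out"] (hypothetical path;
    verify the flags for your version before relying on them).
    Returns True if the dump command was run.
    """
    observed = os.stat(path).st_mtime
    if is_missed_update(expected_mtime, observed):
        run(dump_cmd)
        return True
    return False
```

The expected mtime would have to come from whatever machine wrote the
file (the Zeitstempel value, if my reading of your logs is right).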

We will also need information about the file you're looking at: probably
the FID, file length (expected vs. actual, if they differ), and filename.
It would also help to know the pid of the process that's reading the
file, and exactly how you are reading it (e.g. via 'cat', or just
looking at the mtime with 'ls'/'stat').
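
If the check is scripted, most of those details can be recorded at the
moment of each check. A minimal sketch in Python, assuming the check is
a plain stat() of the file; the function name and the field names are
my own invention. The FID is not covered here; 'fs getfid' on the same
path should report it.

```python
import os
import time

def describe_file(path):
    """Collect what a stat()-based check sees for 'path' right now."""
    st = os.stat(path)
    return {
        "filename": os.path.abspath(path),
        "length": st.st_size,        # actual length as seen by this client
        "mtime": st.st_mtime,        # mtime as seen by this client
        "reader_pid": os.getpid(),   # pid of the process doing the check
        "checked_at": time.time(),   # wall-clock time of the check
    }
```

Logging one such record per check, on each client, would let us line up
the observed mtime and length against the timestamps in the trace.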

Also, are you only running this experiment on your actual webservers? If
you had a client that was running nothing but this test, it would make
the trace a bit easier to look through, and it would make it less likely
that we'd drop trace information due to running out of buffer space.

Andrew Deason