[OpenAFS] Strange caching failures

Stephan Wonczak a0033@rrz.uni-koeln.de
Fri, 1 Mar 2013 10:19:03 +0100 (CET)


   Hi Russ!

On Thu, 28 Feb 2013, Russ Allbery wrote:

> Stephan Wonczak <a0033@rrz.uni-koeln.de> writes:
>
>>   for the past few weeks, we have been struck by a very weird behavior
>> regarding cache updates of AFS clients. It looks like sometimes the
>> callback does not work and one client is stuck with an older version of
>> the file in question. Example:
>
>>   Write to file 'foo' on client A every five minutes.
>>   Clients B, C and D dutifully update their caches and see the updates.
>>   After some time, suddenly client B does not see the updates any more,
>> while clients C and D continue working fine.
>
> When this happens, does the write to the file on client A block for a few
> seconds?

   Very difficult to say since the behavior is so non-deterministic.
   What my colleague did was to set up a cron job on one machine (client
version 1.4.11) that writes to a short status file every five minutes, and
then run 'ls -l' on several other clients (both 1.4 and 1.6).
   So far it *looks* like only the 1.6.x clients stop updating,
but for this specific file it takes between 4 and 12 hours for the effect
to show up.
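
   The reader-side check described above could be sketched roughly as
follows in Python. This is only an illustration, not the script actually
used: the path, the function names and the tolerance of two missed
intervals are assumptions, and it presumes the clocks of writer and reader
are reasonably synchronised. It stats the status file as the local client
sees it (the same information 'ls -l' shows) and flags it when the
modification time is older than expected.

```python
import os
import sys
import time

INTERVAL = 300  # seconds between writes on the writer client (five minutes)
SLACK = 2       # assumed tolerance: missed intervals before the view counts as stale


def is_stale(mtime, now, interval=INTERVAL, slack=SLACK):
    """Return True if mtime is more than slack * interval seconds before now."""
    return (now - mtime) > slack * interval


def check(path):
    """Stat the file as this client sees it and report whether it looks stale."""
    st = os.stat(path)
    return is_stale(st.st_mtime, time.time())


if __name__ == "__main__":
    # Hypothetical default path; pass the real status file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/afs/example.org/status/foo"
    print(time.strftime("%Y-%m-%d %H:%M:%S"), path, "STALE" if check(path) else "ok")
```

   Run from cron on each reading client with the output appended to a log,
this would show at what time a given client stopped seeing updates.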

 	Dipl. Chem. Dr. Stephan Wonczak

         Regionales Rechenzentrum der Universitaet zu Koeln (RRZK)
         Universitaet zu Koeln, Weyertal 121, 50931 Koeln
         Tel: +49/(0)221/470-89583, Fax: +49/(0)221/470-89625