[OpenAFS] Craziness with cache, Input/Output error, Linux

Chas Williams (CONTRACTOR) chas@cmf.nrl.navy.mil
Mon, 21 Apr 2008 18:52:44 -0400


In message <480CDC3E.7090009@cs.wisc.edu>,David Thompson writes:
>> i suspect you will only see this bug if your filesystem containing the
>> cache is very close to full.
>
>We currently run with a cache set at boot time at 75% of the partition 
>size, and this has reduced the frequency of the problem to close enough 
>to zero for us.  At previous higher values (85% ??) we still saw this on 
>an infrequent but regular basis (across 100s of hosts).

the problem seems to be that afs gets ahead of the filesystem scavenger
thread that reclaims the blocks released by a delete.  perhaps afs should
be smarter in the event of a write failure, like backing off and retrying
one more time.  if your cache is a separate partition, you could try ext2.
the delete semantics on that filesystem are a bit different, i think.
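
to make the "back off and retry" idea concrete, here is a rough userspace
sketch in python.  it is not the cache manager code and not a proposed
patch.  the name write_with_retry, the retry count, the delay, and the
choice of ENOSPC as the error worth retrying are all assumptions made
for illustration.

```python
import errno
import os
import time


def write_with_retry(fd, data, offset, retries=1, delay=0.5,
                     write=None, sleep=time.sleep):
    """Write data at offset; on ENOSPC, back off and retry a bounded number of times.

    Hypothetical helper: the names, defaults, and the choice of ENOSPC
    are assumptions for illustration, not OpenAFS behavior.
    """
    if write is None:
        write = os.pwrite  # available on Unix-like systems
    attempt = 0
    while True:
        try:
            # Returns the number of bytes written; partial writes are
            # not handled in this sketch.
            return write(fd, data, offset)
        except OSError as e:
            # Give up on any other error, or once the retry budget is spent.
            if e.errno != errno.ENOSPC or attempt >= retries:
                raise
            attempt += 1
            # Wait so the filesystem has a chance to reclaim freed blocks.
            sleep(delay * attempt)
```

with retries=1 this matches "retrying one more time": a second ENOSPC is
passed straight back to the caller instead of looping forever.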