[OpenAFS-devel] reproducible problem during cache flush

Nickolai Zeldovich kolya@MIT.EDU
Tue, 30 Jul 2002 13:10:24 -0400


I was also able to reproduce the problem on my machine (Linux 2.4.18,
SMP, though I don't think SMP matters here).  The problem is, as we
already suspected, the small cache.  The writing process is hung in
afs_UFSWrite in afs_osi_Sleep(&afs_WaitForCacheDrain).  The cache
truncate daemon is unable to clean up anything because all the
dcaches are either IFFree (on the free list) or IFDataMod (have
unsaved data), so it just keeps looping.
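
To make the stuck state concrete, here is a toy model in C.  It is
not the OpenAFS code: the toy_* names and the TOY_CLEAN state are
hypothetical, and only the IFFree/IFDataMod distinction comes from
the observation above.  The point is just that a reclaim pass over
a cache holding nothing but free and dirty chunks frees nothing, so
whoever is waiting for the cache to drain is never woken up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk states.  TOY_FREE and TOY_DATAMOD stand in for
 * IFFree and IFDataMod; TOY_CLEAN (in use, no unsaved data) is an
 * assumption added for contrast. */
enum toy_state { TOY_FREE, TOY_DATAMOD, TOY_CLEAN };

/* One pass of a toy "truncate daemon": reclaim every chunk that is in
 * use but holds no unsaved data.  Returns the number reclaimed. */
static int toy_truncate_pass(enum toy_state *chunks, size_t n)
{
    int reclaimed = 0;

    assert(chunks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (chunks[i] == TOY_CLEAN) {
            chunks[i] = TOY_FREE;
            reclaimed++;
        }
        /* TOY_FREE: already free, nothing to gain.
         * TOY_DATAMOD: unsaved data, must not be discarded. */
    }
    return reclaimed;
}
```

With chunks in the states { TOY_FREE, TOY_DATAMOD, TOY_DATAMOD,
TOY_FREE }, every call returns 0 no matter how often it is repeated,
which is the looping seen here.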

It looks like the old 2.2 Linux code called afs_DoPartialWrite to
flush modified chunks if we were running low on cache space, but
the new 2.4 VM-integrated write routines, which use generic_file_write,
don't call afs_DoPartialWrite anywhere.  This is likely the problem.

-- kolya