[OpenAFS] Windows cache rehashed...

Rodney M Dyer rmdyer@uncc.edu
Fri, 19 Dec 2003 11:00:57 -0500


Jeffrey,

Ah, tedious to the end, eh?  No, you are incorrect.  The only error made 
here is that the units I used in my previous email were wrong.  
Everywhere I used "Meg", I should have said "K".  Using that 
reasoning...

8192K = 8 Meg

So,

8192K / 32K = 256 Handles...Not many at all.
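
To make the arithmetic explicit, here is a minimal sketch in Python.  It
assumes the estimate from Jeffrey's mail quoted below (total cache memory
divided by cache block size) and the 32K block size used above.  The
function name is hypothetical and is not part of OpenAFS.

```python
def expected_handles(cache_kb: int, block_kb: int) -> int:
    """Rough estimate of handle count: cache size / block size (both in K).

    This is only the quoted rule of thumb, not how the cache manager
    actually allocates handles.
    """
    if block_kb <= 0:
        raise ValueError("block size must be positive")
    return cache_kb // block_kb


if __name__ == "__main__":
    # 8192K cache with 32K blocks
    print(expected_handles(8192, 32))
```

By the same estimate, a count in the multiple thousands would need a much
larger cache or a much smaller block size than the one I am using.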

So please explain why the handle count rises into the multiple thousands?

If you had looked at my mail a little closer you would have realized my 
mistake.  An 8192 Meg cache would be 8 Gig.  It should be obvious to 
anyone who has used the Windows version that you can't make a cache over 
2 Gig in size.  So I was obviously intending the 8 Meg cache.  You could 
have just asked.  We've been discussing this for a couple of weeks now; 
you should have seen that I meant a smaller cache than normal.

The bug still stands, unless you can reason a way out of this...

Rodney







At 07:01 PM 12/18/2003, Jeffrey Altman wrote:
>Rodney:
>
>The handle count will go down over time after the pages have not been
>touched for a while.  In order for this to happen you must stop
>accessing the AFS file system.  The number of handles which are
>allocated is slightly above
>
>   <total cache memory> / <cache block size>
>
>If you use a large amount of memory and a small block size this number
>can be exceedingly large.
>
>There is no bug here; it is simply an extremely poor design for the
>cache sizes which are desired.  The cache manager needs to be replaced.
>The existing algorithm simply results in huge quantities of thrashing
>when the cache is filled.
>
>Jeffrey Altman
>
>
>Rodney M Dyer wrote:
>
> > Jeffrey and others,
> >
> > Today I've found a way to easily reproduce the bug in the AFS Windows
> > cache manager.  It shows up rather easily as a leak in the handle
> > management.  The number of handles rises out of control as files are
> > being copied from AFS to the local disk.  After the number of handles
> > has risen beyond what is expected, if you run an application from AFS,
> > then the startup time will take much longer than normal.  For example,
> > our ProE application starts up in 40 seconds avg. starting with an
> > empty 8192 Meg cache, but after the bug is reproduced, the time climbs
> > to over 2 minutes.
> >