[OpenAFS] Windows cache rehashed...
Rodney M Dyer
rmdyer@uncc.edu
Thu, 18 Dec 2003 18:53:27 -0500
Jeffrey and others,
Today I've found a way to easily reproduce the bug in the AFS Windows cache
manager. It shows up rather easily as a leak in the handle
management. The number of handles rises out of control as files are being
copied from AFS to the local disk. Once the handle count has risen well
beyond what is expected, any application run from AFS takes much longer
than normal to start. For example, our ProE application starts in about
40 seconds on average with an empty 8192 Meg cache, but after the bug is
reproduced, the startup time climbs to over 2 minutes.
To reproduce the problem, use the following settings...
Windows XP SP1, 1 Gig RAM, P4 3.0 Gig, 100 MBit connectivity
OpenAFS 1.2.10
Cache size: 8192
Chunk size: 32K
Status Entries: 1000
Background Threads: 6
Service Threads: 8
* Make a temporary directory to copy some files to...
c:\>mkdir "c:\temp\test"
* Change into the temporary folder...
c:\>cd "c:\temp\test"
* Make sure you start with a fresh cache...
c:\>net stop "IBM AFS Client"
c:\>del "c:\afscache"
Note: It may take some time here before the AFS service lets
go of the cache. Keep trying the delete until the file is gone. (I'm not
sure why it sometimes takes so long for AFS to shut down. It's probably the
same problem that causes the handle leak.)
c:\>net start "IBM AFS Client"
* Now bring up the Task Manager and, using the View->Select Columns menu,
add the columns for handles, etc., so you can watch "afsd_service.exe".
* Now, from the temporary directory at the command prompt, start a
recursive copy of a large tree of files out of your cell's AFS space...
c:\temp\test>xcopy "\\%computername%-afs\all\cell\dir1\dir2\dir3..." /s /e /f /c
The "/s /e /f /c" switches mean: copy all subdirectories, even empty ones,
show the file names as they are being copied, and continue on errors.
Any directory will do. You may need to copy a large number of files
and/or some big files. I just started the copy on a very large tree and
let it go. When the handles started rising, I just pressed CTRL+C, or
CTRL+Break. Depending on your AFS permissions, you may need a token to do
the copy. Make sure the total size of the files being copied is much larger
than the cache size of 8192 Meg.
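Two of the setup steps above (retrying the cache delete until it succeeds, and checking that the source tree really is bigger than the cache) can be scripted. What follows is only a rough sketch in Python, not something I ran as part of this report. The function names, the retry count, and the delay are made up for illustration, and the cache size has to be converted to bytes by the caller.

```python
import os
import time


def delete_with_retry(path, attempts=30, delay=2.0):
    """Try to delete `path` until it is gone; return True once it is."""
    for _ in range(attempts):
        try:
            os.remove(path)
        except FileNotFoundError:
            return True              # already gone, nothing left to do
        except OSError:
            time.sleep(delay)        # service may still hold the file; retry
            continue
        if not os.path.exists(path):
            return True
    return not os.path.exists(path)


def tree_size_bytes(root):
    """Sum the sizes of all files under `root`, skipping unreadable ones."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass                 # keep going on errors, like xcopy /c
    return total
```

To use it, call delete_with_retry(r"c:\afscache") after stopping the service, then compare tree_size_bytes() of the AFS path you hand to xcopy against your configured cache size in bytes.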
Now, if you watch the Task Manager's "afsd_service.exe" handle count, it
will start out ok, but soon rise out of control. Stopping the copy does
not bring the handle count back down.
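If you would rather log the handle count than watch Task Manager, something like the following Python sketch could do it. It is an illustration only and rests on several assumptions: it needs a Windows machine with PowerShell available (which a stock XP SP1 box does not have), it reads the count through Get-Process, and the 1000-handle growth threshold is an arbitrary number I picked, not a measured one. All function names are hypothetical.

```python
import subprocess
import time


def read_handle_count(process_name="afsd_service"):
    """Ask PowerShell for the handle count of one process (Windows only)."""
    out = subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         f"(Get-Process -Name {process_name}).HandleCount"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def sample_handles(read=read_handle_count, samples=10, interval=5.0):
    """Collect `samples` handle counts, `interval` seconds apart."""
    counts = []
    for i in range(samples):
        counts.append(read())
        if i < samples - 1:
            time.sleep(interval)
    return counts


def keeps_rising(counts, min_growth=1000):
    """True if the counts never drop and grow by at least `min_growth`."""
    never_drops = all(b >= a for a, b in zip(counts, counts[1:]))
    return never_drops and len(counts) > 1 and counts[-1] - counts[0] >= min_growth
```

Run sample_handles() once while the copy is going and again after pressing CTRL+C. If keeps_rising() is still true for the combined samples, the handles were not released when the copy stopped.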
That is about it.
Rodney