[OpenAFS] Debugging AFS sluggishness on MacOS 10.5.7?

Samuel Bayer sam@mitre.org
Wed, 24 Jun 2009 14:27:03 -0400

Hi all -

I've suddenly encountered some awful sluggishness with OpenAFS on my 
Mac, and I'm hoping someone will be able to give me some pointers about 
how to track it down.

Here's the scoop: I've been using OpenAFS quite comfortably and happily 
for a number of years. The research computing support group at our 
corporation maintains an AFS cell behind our firewall, and while they 
don't support Macs, they have no objection to Mac clients. Yesterday, 
for the first time, I noticed that copying from the AFS cell was ungodly 
slow; a 70MB file was taking ~4 minutes to copy to local disk, while a 
similar copy on a supported Linux machine was taking between 5 and 30 
seconds. This is new behavior; as recently as a month ago, I'm certain 
that the copy time on the Mac was comparable.
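
For scale, here is a rough sketch of the throughput those figures imply. It assumes 70 MB means 70 * 10^6 bytes and treats "~4 minutes" as exactly 240 seconds, so the results are approximate; the function name is just for illustration.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Return average throughput in MB/s for a copy of size_mb taking seconds."""
    return size_mb / seconds


SIZE_MB = 70          # file size from the description above
MAC_SECONDS = 240     # "~4 minutes", taken as exactly 240 s (an approximation)
LINUX_SECONDS = (5, 30)  # best and worst Linux copy times reported above

mac_rate = throughput_mb_per_s(SIZE_MB, MAC_SECONDS)
linux_rates = [throughput_mb_per_s(SIZE_MB, s) for s in LINUX_SECONDS]

print(f"Mac:   {mac_rate:.2f} MB/s")
print(f"Linux: {min(linux_rates):.2f} to {max(linux_rates):.2f} MB/s")
print(f"Mac is {MAC_SECONDS // max(LINUX_SECONDS)}x to "
      f"{MAC_SECONDS // min(LINUX_SECONDS)}x slower")
```

So the Mac is getting roughly 0.3 MB/s where the Linux box gets somewhere between about 2.3 and 14 MB/s.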

Here's what I know:

- It's not a network problem. Copying the same file from an NFS server 
on the same subnet as the AFS server is comparable on my Mac and on a 
supported Linux machine, as is scp'ing the file from a supported Linux 
machine.
- It's not an OpenAFS version problem. I upgraded from 1.4.6 to 1.4.10 
with no resulting improvement in the performance. (I didn't do a clean 
install - and I'm going to try that - but I don't have much confidence 
that that will have any effect.)
- It's not a new server problem. My research computing admin tells me 
that nothing has changed in the AFS configuration in the last month 
except that they moved some volumes from one machine to another, but I 
can't believe that would make any difference, since they're both on the 
same subnet and neither of them is a machine directly listed in CellServDB.
- It MIGHT be a Mac OS problem. I upgraded to 10.5.7 about a month ago, 
and it's possible that the new behavior coincided with the install, 
although it's unlikely.
- It MIGHT be a network blocking issue of some sort. When I run the Unix 
"time" utility for the copy, it tells me that virtually no user or 
system time is expended; it's all wall-clock time.
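
In case anyone wants to reproduce the measurement without the "time" utility, here is a minimal sketch of the same comparison in Python. It copies one file and reports wall-clock time next to the user and system CPU time the process consumed. The function name `timed_copy` and the paths passed on the command line are placeholders of mine, not anything OpenAFS provides.

```python
import os
import shutil
import sys
import time


def timed_copy(src, dst):
    """Copy src to dst and report wall-clock vs. CPU time for the copy."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    shutil.copyfile(src, dst)
    wall = time.perf_counter() - wall_before
    cpu_after = os.times()

    size_mb = os.path.getsize(dst) / 1e6
    return {
        "wall_s": wall,
        "user_s": cpu_after.user - cpu_before.user,
        "sys_s": cpu_after.system - cpu_before.system,
        "mb_per_s": size_mb / wall if wall > 0 else float("inf"),
    }


# Usage: python timed_copy.py SOURCE DEST
# SOURCE would be a file in the AFS cell, DEST a path on local disk.
if __name__ == "__main__" and len(sys.argv) == 3:
    stats = timed_copy(sys.argv[1], sys.argv[2])
    for key, value in stats.items():
        print(f"{key}: {value:.3f}")
```

If wall_s is large while user_s and sys_s stay near zero, the process is spending its time waiting on something rather than computing, which is what I'm seeing.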

Where do I go from here? My admins don't know anything about Macs, and 
while I'm reasonably literate about them, I have no idea how to proceed.

Thanks in advance -
Sam Bayer
The MITRE Corporation

P.S. The problem I'm describing is a proxy for a more elaborate problem, 
but I'm confident that solving the problem I'm describing will solve the 
more complex problem as well.