[OpenAFS] Debugging AFS sluggishness on MacOS 10.5.7?

Samuel Bayer sam@mitre.org
Mon, 29 Jun 2009 11:10:53 -0400


Derrick Brashear wrote:
> 
> On Wed, Jun 24, 2009 at 2:27 PM, Samuel Bayer <sam@mitre.org> wrote:
> Hi all -
> 
> I've suddenly encountered some awful sluggishness with OpenAFS on my Mac, and I'm hoping someone will be able to give me some pointers about how to track it down.
> 
> Here's the scoop: I've been using OpenAFS quite comfortably and happily for a number of years. The research computing support group at our corporation maintains an AFS cell behind our firewall, and while they don't support Macs, they have no objection to Mac clients. Yesterday, for the first time, I noticed that copying from the AFS cell was ungodly slow; a 70MB file was taking ~4 minutes to copy to local disk, while a similar copy on a supported Linux machine was taking between 5 and 30 seconds. This is new behavior; as recently as a month ago, I'm certain that the copy time on the Mac was comparable.
> 
> Here's what I know:
> 
> - It's not a network problem. Copying the same file from an NFS server on the same subnet as the AFS server is comparable on my Mac and on a supported Linux machine, as is scp'ing the file from a supported Linux machine.
> 
> Is NFS using TCP in your environment? If it is, you haven't ruled out network yet.

NFS is using TCP, but, as my local sysadmin suggested, I mounted the 
same NFS share using the UDP protocol, and the performance is equivalent 
to the TCP mount. So it's not UDP per se.

> - It's not an OpenAFS version problem. I upgraded from 1.4.6 to 1.4.10 with no resulting improvement in the performance. (I didn't do a clean install - and I'm going to try that - but I don't have much confidence that that will have any effect.)
> 
> I doubt it will, though what's in /var/db/openafs/etc/config/afsd.options?

I did the clean install, and the problem persists.

I don't have an afsd.options file, but I do have afs.conf. Here's the 
OPTIONS value:

OPTIONS="-afsdb -stat 2000 -dcache 800 -daemons 3 -volumes 70 -dynroot -fakestat -all"

It's most likely the default.

> - It's not a new server problem. My research computing admin tells me that nothing has changed in the AFS configuration in the last month except that they moved some volumes from one machine to another, but I can't believe that would make any difference, since they're both on the same subnet and neither of them is a machine directly listed in CellServDB.
> - It MIGHT be a Mac OS problem. I upgraded to 10.5.7 about a month ago, and it's possible that the new behavior coincided with the install, although it's unlikely.
> 
> from what?

Upgraded from 10.5.6.

Another tidbit that is either hopelessly banal because I don't 
understand how AFS works, or really important: WRITING to the same AFS 
directory works fine. I copy the file from the directory to my local 
disk and it takes 4 minutes; I delete the file from the remote disk and 
write it there from my local disk and it takes 30 seconds.
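
One way to put numbers on the read/write asymmetry described above is a small timing script. What follows is only a minimal sketch, not something taken from this thread: the helper names (timed_copy, rate_mb_per_s) are made up for illustration, and any paths passed to timed_copy are placeholders to be replaced with a real AFS path and a real local path. A repeated read may be served from the local AFS cache, so only the first read of a file is a fair measurement.

```python
import shutil
import time
from pathlib import Path


def rate_mb_per_s(size_mb, seconds):
    """Return the transfer rate in MB/s for a given size and duration."""
    return size_mb / seconds


def timed_copy(src, dst):
    """Copy src to dst; return (elapsed seconds, rate in MB/s).

    Hypothetical helper: src and dst are whatever paths the caller
    supplies, e.g. a file in AFS and a file on local disk.
    """
    src, dst = Path(src), Path(dst)
    size_mb = src.stat().st_size / 1e6
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    rate = rate_mb_per_s(size_mb, elapsed) if elapsed > 0 else float("inf")
    return elapsed, rate


if __name__ == "__main__":
    # Rates implied by the figures reported in this message:
    # a 70MB file, ~4 minutes to read from AFS, ~30 seconds to write.
    print(f"read from AFS: {rate_mb_per_s(70, 240):.2f} MB/s")
    print(f"write to AFS:  {rate_mb_per_s(70, 30):.2f} MB/s")
```

Calling timed_copy once in each direction (AFS to local disk, then local disk to AFS) on the same file gives directly comparable figures for the two cases.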

Thanks in advance -
Sam Bayer
The MITRE Corporation
sam@mitre.org