[OpenAFS] File creation delays

John W. Sopko Jr. sopko@cs.unc.edu
Wed, 17 Mar 2010 08:03:37 -0400


We are having intermittent file creation delays. For example, if you make
a directory, or edit a file in vim and close it, it takes 15 seconds or so
to create the file. The problem seems to be getting worse. It goes away
and then comes back; I have seen it on and off over the last year or so.

I see the problem on several clients talking to different file servers.
The clients and file servers are all Red Hat 5.5 systems running OpenAFS
1.4.11. I upgraded one of the clients to 1.4.12 and that did not help. I
will be upgrading the servers sometime soon.

I am using the default medium file server arguments. I have recently
rebooted our file and db servers.

I ran the test during a fairly quiet period, and the file servers do not
appear to be loaded. Is there a way to debug this or get more information
to troubleshoot the problem?

Below is an example of creating a directory in AFS that took 15
seconds. The rmdir and the next mkdir complete immediately. Below that is
strace output of the mkdir command, showing where it hangs.

% date; mkdir try; date
Wed Mar 17 06:37:18 EDT 2010
Wed Mar 17 06:37:33 EDT 2010

% date; rmdir try; date
Wed Mar 17 06:37:45 EDT 2010
Wed Mar 17 06:37:45 EDT 2010

% date ; mkdir try ; date
Wed Mar 17 06:37:57 EDT 2010
Wed Mar 17 06:37:57 EDT 2010
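
Because the delay is intermittent, a single date/mkdir/date run can easily
miss it. The following is a minimal sketch of a loop that repeats the same
test and reports only the slow cycles. The function name, the try.PID.N
directory names, and the TARGET_DIR, COUNT, and THRESHOLD variables are
hypothetical and not part of the original test; timing uses one-second
resolution from date +%s.

```shell
#!/bin/sh
# Repeat the mkdir/rmdir test and report cycles where mkdir is slow.
# All names below are illustrative assumptions, not from the original post.
time_mkdir_cycles() {
    target_dir=$1   # directory to test in (an AFS path in this scenario)
    count=$2        # number of mkdir/rmdir cycles
    threshold=$3    # report mkdir runs taking at least this many seconds
    i=1
    slow=0
    while [ "$i" -le "$count" ]; do
        d="$target_dir/try.$$.$i"
        start=$(date +%s)
        mkdir "$d"
        end=$(date +%s)
        elapsed=$((end - start))
        if [ "$elapsed" -ge "$threshold" ]; then
            slow=$((slow + 1))
            echo "$(date): mkdir $d took ${elapsed}s"
        fi
        rmdir "$d"
        i=$((i + 1))
    done
    echo "slow=$slow of $count"
}

# Defaults: current directory, 5 cycles, report anything 5 seconds or longer.
time_mkdir_cycles "${TARGET_DIR:-.}" "${COUNT:-5}" "${THRESHOLD:-5}"
```

Running it from cron or in a long loop on more than one client would show
whether the slow cycles cluster at particular times or on particular servers.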


Output from strace, last few lines:

% strace mkdir try

open("/proc/mounts", O_RDONLY)          = 3
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ae00d5d2000
read(3, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 892
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x2ae00d5d2000, 4096)            = 0
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=56442464, ...}) = 0
mmap(NULL, 56442464, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2ae00d5ff000
close(3)                                = 0
mkdir("try", 0777)                      = 0

----15 second delay happens here


close(1)                                = 0
exit_group(0)                           = ?
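
The trace above shows where the pause falls only by annotation. As a sketch,
strace can record the timing itself: -tt prefixes each line with a wall-clock
timestamp and -T appends the time spent inside each call. The TRACE file name
and DIR directory name here are hypothetical.

```shell
# Hypothetical names; override TRACE and DIR as needed.
TRACE="${TRACE:-mkdir.trace}"
DIR="${DIR:-try.$$}"
if command -v strace >/dev/null 2>&1; then
    # -tt: timestamp with microseconds on every line
    # -T:  time spent in each system call, shown as <seconds>
    # -o:  write the trace to a file instead of stderr
    strace -tt -T -o "$TRACE" mkdir "$DIR"
    rmdir "$DIR"
    # The last few lines cover mkdir(), close(1) and exit_group().
    tail -n 5 "$TRACE"
else
    echo "strace not found" >&2
fi
```

A gap between consecutive timestamps, or a large <seconds> value on one line,
would indicate which call the 15 seconds belongs to.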


-- 
John W. Sopko Jr.               University of North Carolina
email: sopko AT cs.unc.edu      Computer Science Dept., CB 3175
Phone: 919-962-1844             Fred Brooks Building; Room 140
Fax:   919-962-1799             Chapel Hill, NC 27599-3175