[OpenAFS] Re: File creation delays

John W. Sopko Jr. sopko@cs.unc.edu
Wed, 17 Mar 2010 15:10:49 -0400


Andrew Deason wrote, On 3/17/2010 2:57 PM:
> On Wed, 17 Mar 2010 14:43:30 -0400
> "John W. Sopko Jr."<sopko@cs.unc.edu>  wrote:
>
>> Here is the strace command and the output of the FileLog trace, thanks
>> for your help. I did a "rmdir try" and the strace command complained
>> "No such device", if I "fs flushvolume" on the client it shows the
>> directory got removed else it shows it still there.
>>
>> Output from the "strace -tt rmdir try" command on the client host IP
>> 152.2.140.200:
>>
>> 14:25:39.851544 rmdir("try")            = -1 ENODEV (No such device)
>
> I'm not sure why you get an ENODEV, but for the moment I'm ignoring it
> to look at the delay problem instead... if anyone else has an intuition
> on that, feel free.
>
>> Here is the file server log "kill TSTP" 4 times during the same time,
>> the client IP is 152.2.140.200
>
> Trimming this down to the relevant thread...
>
>> Wed Mar 17 14:25:39 2010 [12] SAFS_RemoveDir    try,  Did =
>> 536884167.59.4927, Host 152.2.140.200:7001, Id 3903
>> Wed Mar 17 14:25:39 2010 [12] BCB: BreakCallBack(all but
>> 152.2.140.200:7001, (536884167,59,4927))
>> Wed Mar 17 14:25:47 2010 [12] Starting multibreakcall back on all addr
>> for host 152.2.140.115
>> Wed Mar 17 14:25:54 2010 [12] BCB: Failed on file 536884167.59.4927,
>> Host 152.2.140.115:7001 is down
>> Wed Mar 17 14:25:54 2010 [12] SAFS_RemoveDir    returns 0
>
> So, 152.2.140.115 looks like it's having trouble receiving callback
> breaks. Do you know what that host is? Does it perchance have some kind
> of firewall or anything that could prevent it from receiving incoming
> UDP packets on port 7001?

Hmmm, 152.2.140.115 is my Windows 7 desktop. I use the SecureCRT ssh
client to log in from 152.2.140.115 to various Linux servers such as
152.2.140.200. I am running the OpenAFS Windows client on my W7
desktop and it seems to work fine: I can copy, create, and delete files.
So I assumed the firewall was open. I just ran
"rxdebug 152.2.140.115 7001 -version" from the file server and
it hung, so I will look into opening the firewall.
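
Since the plain rxdebug call hung, a wrapper with a timeout makes it
easier to check clients from the file server without waiting on each
one. This is only a rough sketch, assuming Python 3 and rxdebug on the
PATH; the function name probe_callback_port and the 5-second timeout
are my own choices, not anything that ships with OpenAFS:

```python
import subprocess

def probe_callback_port(host, port=7001, timeout=5.0, cmd="rxdebug"):
    """Run 'rxdebug <host> <port> -version' and classify the result.

    Returns 'open', 'no-reply', or 'missing'. The names and the timeout
    value are illustrative assumptions, not part of OpenAFS.
    """
    argv = [cmd, host, str(port), "-version"]
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout
        )
    except FileNotFoundError:
        return "missing"   # the rxdebug binary itself was not found
    except subprocess.TimeoutExpired:
        return "no-reply"  # probe hung: packets filtered or host down
    # A successful probe prints a line starting with "AFS version:"
    return "open" if "AFS version" in result.stdout else "no-reply"
```

Calling probe_callback_port("152.2.140.115") from the file server
should then come back within the timeout instead of hanging.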

I would not expect this to matter, though, since I am only ssh'ing from
my 152.2.140.115 W7 machine to the Linux clients, and they are the ones
having the problem. I just logged into my Linux desktop, 152.2.140.200, and
did a mkdir and got the delay problem; it is running OpenAFS 1.4.12. From the
file server I am accessing, rxdebug shows that port 7001 is open on the
client having the issues:

% rxdebug lark.cs.unc.edu 7001 -version
Trying 152.2.140.200 (port 7001):
AFS version:  OpenAFS 1.4.12 built  2010-03-17
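
To put numbers on the delay, the gaps between the FileLog lines quoted
earlier can be computed directly. A minimal sketch, assuming Python 3,
an English/C locale for the day and month names, and the wrapped log
lines rejoined; parse_line and step_delays are hypothetical helper
names:

```python
from datetime import datetime

# FileLog excerpt for thread [12], with the mail-wrapped lines rejoined
FILELOG = """\
Wed Mar 17 14:25:39 2010 [12] SAFS_RemoveDir    try,  Did = 536884167.59.4927, Host 152.2.140.200:7001, Id 3903
Wed Mar 17 14:25:39 2010 [12] BCB: BreakCallBack(all but 152.2.140.200:7001, (536884167,59,4927))
Wed Mar 17 14:25:47 2010 [12] Starting multibreakcall back on all addr for host 152.2.140.115
Wed Mar 17 14:25:54 2010 [12] BCB: Failed on file 536884167.59.4927, Host 152.2.140.115:7001 is down
Wed Mar 17 14:25:54 2010 [12] SAFS_RemoveDir    returns 0
"""

def parse_line(line):
    """Split a FileLog line into (timestamp, thread id, message)."""
    # The ctime-style timestamp occupies the first 24 characters
    stamp, rest = line[:24], line[24:].strip()
    thread, _, message = rest.partition("] ")
    when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return when, thread.lstrip("["), message

def step_delays(text):
    """Yield (seconds since the previous line, message) per log line."""
    previous = None
    for line in text.splitlines():
        when, _thread, message = parse_line(line)
        gap = 0.0 if previous is None else (when - previous).total_seconds()
        previous = when
        yield gap, message

for gap, message in step_delays(FILELOG):
    print(f"+{gap:4.0f}s  {message}")
```

On the excerpt above this shows 8 seconds before the multibreakcall to
152.2.140.115 starts and 7 more before that host is marked down, 15
seconds in all for the one RemoveDir call.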





-- 
John W. Sopko Jr.               University of North Carolina
email: sopko AT cs.unc.edu      Computer Science Dept., CB 3175
Phone: 919-962-1844             Fred Brooks Building; Room 140
Fax:   919-962-1799             Chapel Hill, NC 27599-3175