[OpenAFS] Re: File creation delays
John W. Sopko Jr.
sopko@cs.unc.edu
Wed, 17 Mar 2010 15:10:49 -0400
Andrew Deason wrote, On 3/17/2010 2:57 PM:
> On Wed, 17 Mar 2010 14:43:30 -0400
> "John W. Sopko Jr."<sopko@cs.unc.edu> wrote:
>
>> Here is the strace command and the output of the FileLog trace, thanks
>> for your help. I did a "rmdir try" and the strace command complained
>> "No such device", if I "fs flushvolume" on the client it shows the
>> directory got removed else it shows it still there.
>>
>> Output from the "strace -tt rmdir try" command on the client host IP
>> 152.2.140.200:
>>
>> 14:25:39.851544 rmdir("try") = -1 ENODEV (No such device)
>
> I'm not sure why you get an ENODEV, but for the moment I'm ignoring it
> to look at the delay problem instead... if anyone else has an intuition
> on that, feel free.
>
>> Here is the file server log "kill TSTP" 4 times during the same time,
>> the client IP is 152.2.140.200
>
> Trimming this down to the relevant thread...
>
>> Wed Mar 17 14:25:39 2010 [12] SAFS_RemoveDir try, Did =
>> 536884167.59.4927, Host 152.2.140.200:7001, Id 3903
>> Wed Mar 17 14:25:39 2010 [12] BCB: BreakCallBack(all but
>> 152.2.140.200:7001, (536884167,59,4927))
>> Wed Mar 17 14:25:47 2010 [12] Starting multibreakcall back on all addr
>> for host 152.2.140.115
>> Wed Mar 17 14:25:54 2010 [12] BCB: Failed on file 536884167.59.4927,
>> Host 152.2.140.115:7001 is down
>> Wed Mar 17 14:25:54 2010 [12] SAFS_RemoveDir returns 0
>
> So, 152.2.140.115 looks like it's having trouble receiving callback
> breaks. Do you know what that host is? Does it perchance have some kind
> of firewall or anything that could prevent it from receiving incoming
> UDP packets on port 7001?

Hmmm, 152.2.140.115 is my Windows 7 desktop. I use the SecureCRT ssh
client to ssh from 152.2.140.115 to the various Linux servers like
152.2.140.200. I am running the OpenAFS Windows client on my W7
desktop and it seems to work fine: I can copy, create, and delete
files, so I assumed the firewall was open. I just ran
"rxdebug 152.2.140.115 7001 -version" from the file server and
it hung. I will look into opening the firewall.

I would think that this would not matter, since I am only ssh'ing from
my 152.2.140.115 W7 machine to the Linux clients, and they are the ones
having the problem.

I just logged into my Linux desktop 152.2.140.200, did a mkdir, and got
the delay problem. It is running OpenAFS 1.4.12. Running rxdebug from
the file server I am accessing against the client having issues shows
that port 7001 is open:

% rxdebug lark.cs.unc.edu 7001 -version
Trying 152.2.140.200 (port 7001):
AFS version: OpenAFS 1.4.12 built 2010-03-17
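
For reference, the delay in the quoted FileLog excerpt (thread [12],
SAFS_RemoveDir start to return) can be measured from the timestamps.
This is only a minimal sketch, assuming each log entry has been joined
back onto a single line in the "timestamp [thread] message" form shown
above; the helper name thread_elapsed and the SAMPLE list are
hypothetical and not part of OpenAFS.

```python
import re
from datetime import datetime

# Assumed FileLog line shape, e.g.:
#   Wed Mar 17 14:25:39 2010 [12] SAFS_RemoveDir returns 0
LINE_RE = re.compile(
    r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) \[(\d+)\] (.*)$"
)


def thread_elapsed(lines, thread_id):
    """Seconds between the first and last entry logged by one thread.

    Hypothetical helper; returns None if the thread never appears.
    """
    stamps = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and int(m.group(2)) == thread_id:
            # Collapse runs of spaces (single-digit days are padded).
            ts = " ".join(m.group(1).split())
            stamps.append(datetime.strptime(ts, "%a %b %d %H:%M:%S %Y"))
    if not stamps:
        return None
    return (stamps[-1] - stamps[0]).total_seconds()


# Entries taken from the quoted FileLog excerpt, unwrapped by hand.
SAMPLE = [
    "Wed Mar 17 14:25:39 2010 [12] SAFS_RemoveDir try, Did = "
    "536884167.59.4927, Host 152.2.140.200:7001, Id 3903",
    "Wed Mar 17 14:25:47 2010 [12] Starting multibreakcall back on all "
    "addr for host 152.2.140.115",
    "Wed Mar 17 14:25:54 2010 [12] BCB: Failed on file "
    "536884167.59.4927, Host 152.2.140.115:7001 is down",
    "Wed Mar 17 14:25:54 2010 [12] SAFS_RemoveDir returns 0",
]

if __name__ == "__main__":
    print(thread_elapsed(SAMPLE, 12))  # 15.0
```

On the excerpt above this gives 15 seconds between the RemoveDir
request and its return, with the time spent on the callback break to
152.2.140.115.
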
--
John W. Sopko Jr. University of North Carolina
email: sopko AT cs.unc.edu Computer Science Dept., CB 3175
Phone: 919-962-1844 Fred Brooks Building; Room 140
Fax: 919-962-1799 Chapel Hill, NC 27599-3175