[OpenAFS] fileserver crashes

John W. Sopko Jr. sopko@cs.unc.edu
Wed, 20 Oct 2004 14:14:54 -0400


The user who was having the problem upgraded to OpenAFS 1.3.73 and is still
having problems. He goes into his application (Matlab), which lets you
cd and list files from within the application. When he does this, the
application just hangs. I turned up the FileLog debugging, and the file
server logs a continuous stream of requests like the following until he
kills the Matlab application:

---
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372998.237859
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372998.237859, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData returns 0
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus,  Fid = 1769554818.372996.237858, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus returns 0
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372996.237858
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372996.237858, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData returns 0
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus,  Fid = 1769554818.372994.237857, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus returns 0
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372994.237857
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372994.237857, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData returns 0
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus,  Fid = 1769554818.372992.237856, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus returns 0
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372992.237856
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372992.237856, Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData returns 0
---
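
In case it helps anyone triaging a log like this, here is a minimal Python
sketch that tallies FileLog request lines per client host and operation. It
assumes the entry layout shown in the excerpt above (timestamp, operation
name, Fid, then Host and Id), re-joins entries that the mailer wrapped onto a
second line, and skips the "returns" lines and entries without a Host field.
The names ENTRY_RE, unwrap, and tally are my own and not part of OpenAFS.

```python
import re
from collections import Counter

# ctime-style timestamp at the start of every FileLog entry (assumed layout).
TIMESTAMP = r"\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4} "
TIMESTAMP_RE = re.compile("^" + TIMESTAMP)

# Request entry: "<timestamp> <op>, Fid = <fid>[, Host <ip>, Id <n>]".
# Host/Id is optional because some entries in the excerpt omit it.
ENTRY_RE = re.compile(
    "^" + TIMESTAMP
    + r"(?P<op>\w+),\s+Fid = (?P<fid>[\d.]+)"
    + r"(?:,\s*Host (?P<host>[\d.]+), Id (?P<id>\d+))?"
)


def unwrap(lines):
    """Re-join entries that were wrapped onto a continuation line."""
    entries = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            # No leading timestamp: treat it as the tail of the previous entry.
            entries[-1] = entries[-1] + " " + line.strip()
    return entries


def tally(lines):
    """Count request entries per (host, operation) pair."""
    counts = Counter()
    for entry in unwrap(lines):
        m = ENTRY_RE.match(entry)
        if m and m.group("host"):
            counts[(m.group("host"), m.group("op"))] += 1
    return counts


# A few entries copied from the excerpt above, still wrapped.
SAMPLE = """\
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372998.237859
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData, Fid = 1769554818.372998.237859,
Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SRXAFS_FetchData returns 0
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus,  Fid = 1769554818.372996.237858,
Host 152.2.128.179, Id 5269
Wed Oct 20 13:30:24 2004 SAFS_FetchStatus returns 0
"""

if __name__ == "__main__":
    for (host, op), n in tally(SAMPLE.splitlines()).most_common():
        print(host, op, n)
```

Running tally over the whole FileLog and looking at the top few entries of
most_common() should make a client that is looping like this stand out.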

He has the same problem from two different Windows machines running
1.3.73. The directory he is listing contains 1777 files.
We can see the files fine through a normal Windows Explorer window, and
I can list the directory from a 1.2.11 Linux client without a problem.

I am more of a Unix/Linux guy, but I have involved our Windows folks.
If you can tell me how to use the Windows debug client or get more
information some other way, I will be happy to try it out.
Thanks for your help.

John W. Sopko Jr. wrote:

> Derrick,
> 
> Jeff Altman pointed me to a Windows beta client that I downloaded and
> will have our user test. Is the patch you refer to below a Windows client
> patch or a Linux server patch?
> 
> Jeff,
> 
> We are having an important site visit here the next 2 days and
> will not be able to test out the windows beta client until Monday.
> 
> Derrick J Brashear wrote:
> 
>> the Windows client problem you reference below will be fixed in 1.3.72;
>> however, the underlying filesystem issue is almost certainly the same
>> one other people are having, and i can give you a patch if you're
>> willing to try it, which will not fix the issue but may help us track it
>>
>> On Wed, 13 Oct 2004, John W. Sopko Jr. wrote:
>>
>>> Our Linux/AFS 1.2.11 file server has been hanging the last few weeks.
>>> We have been upgrading machines to Windows XP SPII and OpenAFS 1.3.x
>>> over the last month or so. Here is one issue I found that was causing
>>> the problem:
>>>
>>> We have a user who uses a Windows application called Matlab for
>>> generating and processing hundreds of files in AFS space from a Windows
>>> XP machine. He was running the OpenAFS 1.2.x client. His machine was
>>> upgraded to Service Pack II and OpenAFS 1.3.71. His Matlab application
>>> hangs in Windows and our file server eventually melts down.
>>>
>>> I am not an expert at debugging AFS; let me know if you want me to try
>>> something. I cranked up the debug level on the FileLog to 25. I could see
>>> his machine was constantly logging messages like this (the user name
>>> really is debug):
>>>
>>> Wed Oct 13 12:15:47 2004 FindClient: authenticating connection: authClass=2
>>> Wed Oct 13 12:15:47 2004 FindClient: rxkad conn: name=debug,inst=,cell=,exp=1097688735,kvno=8

-- 
John W. Sopko Jr.               University of North Carolina
email: sopko AT cs.unc.edu      Computer Science Dept., CB 3175
Phone: 919-962-1844             Sitterson Hall; Room 044
Fax:   919-962-1799             Chapel Hill, NC 27599-3175