[OpenAFS-port-darwin] Re: OS X hangs when accessing files
Systems Administration
sysadmin@contrailservices.com
Mon, 9 Aug 2004 11:55:01 -0600
>> No - the mac client is hung indefinitely - at least I have not had
>> the patience to wait it out - 60 minutes is my limit to have my
>> workstation be useless.
Hmm - correction on this: the hang does not seem to be related to
permissions as I had thought. It definitely hangs even in file spaces
that are accessible to my administratively privileged account. In this
example I am untarring an archive onto the server, and it hangs after
about half of the tarball has been uploaded.
The only way to force a 'timeout' is to restart the bosserver
on the fileserver box.
This happens with two different fileservers now. One is the cell master
and KRB5 KDC as well as an AFS fileserver; the other is just an AFS
fileserver running only the fs processes. All of the traffic is
between the client mac and the fileserver hosting the disk. Restarting
the cell master has no effect; only restarting the fileserver causes
the client to time out and releases the system hang.
>
> tcpdump port 7000; does anything show up?
> cmdebug (hung client hostname); do you see any locks held?
cmdebug shows one lock on the file while the client is hung, and I
cannot kill the process.
** Cache entry @ 0x0d75d968 for 1.536871048.193.860 [ridgebacksystems.com]
    locks: (none_waiting, upgrade_locked(pid:1175 at:66))
    2048 bytes DV 26 refcnt 25
    callback 0281ef40 expires 1092087635
    0 opens 0 writers
    normal file
    states (0x1), stat'd
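For anyone reading the cache entry above, here is a minimal sketch that decodes two of its fields. It assumes the dotted identifier is cell number, volume, vnode, unique (in that order) and that "expires" is a Unix epoch timestamp in seconds; the helper names parse_fid and expiry_utc are made up for illustration and are not part of any AFS tool.

```python
from datetime import datetime, timezone


def parse_fid(fid_text):
    """Split a 'cell.volume.vnode.unique' string into integers.

    The field order is an assumption about the cmdebug output format.
    """
    cell, volume, vnode, unique = (int(part) for part in fid_text.split("."))
    return {"cell": cell, "volume": volume, "vnode": vnode, "unique": unique}


def expiry_utc(epoch_seconds):
    """Interpret the callback expiry as seconds since the Unix epoch (assumed)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


fid = parse_fid("1.536871048.193.860")
# Rewrite the FID in the volume/vnode/unique form that tcpdump prints.
tcpdump_fid = "{volume}/{vnode}/{unique}".format(**fid)
expires = expiry_utc(1092087635)

print(tcpdump_fid)          # 536871048/193/860
print(expires.isoformat())  # 2004-08-09T21:40:35+00:00
```

Under those assumptions the locked cache entry is the same file that the fetch-data calls in the packet trace refer to, and its callback expires at 21:40:35 UTC (15:40:35 at -0600) on the day of the capture.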
tcpdump shows:
11:39:36.855104 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx data fs call
fetch-data fid 536871048/193/860 offset 0 length 999999999 (52)
11:39:37.238583 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx data fs call
fetch-data fid 536871048/193/860 offset 0 length 999999999 (52)
11:39:37.238858 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 1754 reason duplicate packet (65)
11:39:51.890754 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 0 reason ping (65)
11:39:51.891032 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx ack first 1
serial 890 reason ping response (65)
11:40:01.254868 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx ack first 1
serial 0 reason ping (65)
11:40:01.255304 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 1756 reason ping response (65)
11:40:16.942052 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 0 reason ping (65)
11:40:16.942344 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx ack first 1
serial 895 reason ping response (65)
11:40:25.259102 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx ack first 1
serial 0 reason ping (65)
11:40:25.259370 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 1758 reason ping response (65)
11:40:36.983144 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 0 reason ping (65)
11:40:36.983418 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx ack first 1
serial 899 reason ping response (65)
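To make the pattern in the trace above easier to see, here is a rough sketch that re-joins the mail-wrapped tcpdump lines and counts packets by kind. The function names split_packets and classify are hypothetical, the regular expressions only cover the line shapes that appear in this capture, and the sample is just the first five packets copied from the trace.

```python
import re
from collections import Counter

# A packet starts at a timestamp such as "11:39:36.855104 IP".
PACKET_START = re.compile(r"(?=\b\d{2}:\d{2}:\d{2}\.\d{6} IP\b)")


def split_packets(text):
    """Undo the mail client's line wrapping and return one string per packet."""
    flat = " ".join(text.split())
    return [p.strip() for p in PACKET_START.split(flat) if p.strip()]


def classify(packet):
    """Label a packet as 'data:<call>' or 'ack:<reason>'."""
    if " rx data " in packet:
        m = re.search(r"fs call (\S+)", packet)
        return "data:" + (m.group(1) if m else "unknown")
    m = re.search(r"rx ack .* reason (.+?) \(\d+\)$", packet)
    return "ack:" + (m.group(1) if m else "unknown")


SAMPLE = """
11:39:36.855104 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx data fs call
fetch-data fid 536871048/193/860 offset 0 length 999999999 (52)
11:39:37.238583 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx data fs call
fetch-data fid 536871048/193/860 offset 0 length 999999999 (52)
11:39:37.238858 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 1754 reason duplicate packet (65)
11:39:51.890754 IP
turbine.internal.contrailservices.com.afs3-fileserver >
lightning.internal.contrailservices.com.afs3-callback: rx ack first 2
serial 0 reason ping (65)
11:39:51.891032 IP
lightning.internal.contrailservices.com.afs3-callback >
turbine.internal.contrailservices.com.afs3-fileserver: rx ack first 1
serial 890 reason ping response (65)
"""

counts = Counter(classify(p) for p in split_packets(SAMPLE))
for kind, n in sorted(counts.items()):
    print(kind, n)
```

Run over the whole capture, this kind of tally shows the fetch-data call sent twice, one duplicate-packet ack from the fileserver, and after that only ping and ping-response acks in both directions; no rx data packet from the fileserver appears in the captured window.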
Ted