[OpenAFS] Problems with NAT & extremely slow fileserver
giovanni bracco
bracco@frascati.enea.it
Tue, 8 Feb 2005 16:45:57 +0100
In my institution we run an AFS cell where some of the fileservers run OpenAFS
and the others (most of them) run Transarc AFS.
Every now and then (once a month or less) one of our fileservers becomes
very slow. Running
rxdebug $servername 7000 -rxstats
shows that the server has 9 connections to the SAME client, which
blocks all activity:
Tue Feb 8 14:33:22 NFT 2005 waiting_for_process wp=00009_res=01287_ig=25802
1 192.107.51.29 Port=1434_id=8bb6c9ac/8162d80_R=2288_S=28124
2 192.107.51.29 Port=1434_id=8bb6c9ac/8162d84_R=2288_S=28124
3 192.107.51.29 Port=1434_id=8bb6c9ac/8162d88_R=2288_S=28124
4 192.107.51.29 Port=1434_id=8bb6c9ac/8162d90_R=2288_S=28124
5 192.107.51.29 Port=1434_id=8bb6c9ac/8162d94_R=2288_S=28124
6 192.107.51.29 Port=1434_id=8bb6c9ac/8162d98_R=2288_S=28124
7 192.107.51.29 Port=1434_id=8bb6c9ac/8162da0_R=2288_S=28124
8 192.107.51.29 Port=1434_id=8bb6c9ac/8162da4_R=2288_S=28124
9 192.107.51.29 Port=1434_id=8bb6c9ac/8162da8_R=2288_S=28124
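
To spot this situation quickly, the connections in a listing like the one above can be tallied per client. The following is only a minimal sketch: it assumes the line layout shown in this message (index, IP address, then a Port= field), and the names LINE_RE and count_connections are made up for illustration, not part of any AFS tool.

```python
import re
from collections import Counter

# Matches lines such as "1 192.107.51.29 Port=1434_id=..." (layout assumed
# from the listing in this message; other rxdebug output may differ).
LINE_RE = re.compile(r"^\s*\d+\s+(\d+\.\d+\.\d+\.\d+)\s+Port=(\d+)")

def count_connections(listing):
    """Return a Counter mapping (ip, port) to the number of listed connections."""
    counts = Counter()
    for line in listing.splitlines():
        m = LINE_RE.match(line)
        if m:  # the timestamp/header line does not match and is skipped
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts

# Excerpt of the listing above (first three connections only).
sample = """\
Tue Feb 8 14:33:22 NFT 2005 waiting_for_process wp=00009_res=01287_ig=25802
1 192.107.51.29 Port=1434_id=8bb6c9ac/8162d80_R=2288_S=28124
2 192.107.51.29 Port=1434_id=8bb6c9ac/8162d84_R=2288_S=28124
3 192.107.51.29 Port=1434_id=8bb6c9ac/8162d88_R=2288_S=28124
"""

for (ip, port), n in count_connections(sample).most_common():
    print(ip, port, n)
```

A single (ip, port) pair holding many connections at once is the pattern described here.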
The client is usually an OpenAFS Windows client behind NAT
(this also happens with recent 1.3.x versions).
We have definitely observed it on Transarc AFS fileservers. Today's case is a
Solaris machine with Transarc AFS 3.6 2.32.
The only way to end the problem is to disconnect the client completely.
If the fileserver is simply restarted using bos, the problem reappears within a
short time.
When the problem occurs, the following messages appear (3-4 times per
minute) in the FileLog:
...
Tue Feb 8 07:57:36 2005 CB: RCallBackConnectBack failed for c06b331d.1434
Tue Feb 8 07:58:32 2005 CB: Call back connect back failed (in break delayed)
for c06b331d.1434
Tue Feb 8 07:58:32 2005 BreakDelayedCallbacks FAILED for host c06b331d which
IS UP. Possible network or routing failure.
...
where c06b331d.1434 is the hexadecimal form of the address and port reported
by rxdebug: 192.107.51.29, port 1434.
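
The correspondence between the two notations can be checked mechanically. This is a small sketch, assuming the FileLog prints the IPv4 address as eight hex digits in network byte order, optionally followed by ".port", as in the lines quoted above; the function name filelog_host_to_ip is only illustrative.

```python
import ipaddress
from typing import Optional, Tuple

def filelog_host_to_ip(token: str) -> Tuple[str, Optional[int]]:
    """Convert 'c06b331d.1434' (or 'c06b331d') to a dotted-quad IP and port."""
    hex_addr, _, port = token.partition(".")
    # Assumption: the hex digits are the address in network byte order.
    ip = ipaddress.IPv4Address(int(hex_addr, 16))
    return str(ip), (int(port) if port else None)

print(filelog_host_to_ip("c06b331d.1434"))
```

For the host in the log lines above, this gives 192.107.51.29 and port 1434, matching the rxdebug listing.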
Searching the web for the keyword BreakDelayedCallbacks, I found a 2001
posting:
https://lists.openafs.org/pipermail/openafs-devel/2001-March/005683.html
which seems related to the "BreakDelayedCallbacks" error message and
suggests a patch for OpenAFS.
I have tried to describe the problem, but I do not understand why it
arises so rarely and only with NAT clients.
The question:
has this kind of problem been solved in the current version of OpenAFS, and
is the solution to migrate all our fileservers to OpenAFS?
Any suggestion or explanation is welcome!
Giovanni
--
Giovanni Bracco
ENEA INFO
(Servizio Informatica e Reti)
Via E. Fermi 45
I-00044 Frascati (Roma) Italy
phone 00-39-06-9400-5597
FAX 00-39-06-9400-5735
E-mail bracco@frascati.enea.it
WWW http://fusfis.frascati.enea.it/~bracco