[OpenAFS] ProbeUuid failed for host xxx.xxx.xxx.xxx:7001
Lester Barrows
barrows@email.arc.nasa.gov
Wed, 30 Mar 2005 11:56:01 -0800
It seems I was barking up the wrong tree with the previous error, which
confused the issue. The ProbeUuid error may have more to do with the problem.
A more complete description, ideally with no ambiguity, is in order:
- Occasionally when many small files are transferred quickly onto a volume,
the server containing the volume will time out on one or more clients. These
clients will no longer be able to access the server.
- A "Connection timed out" error is shown in a terminal session on an affected
client when attempting to access a volume on the affected server, which has
now become inaccessible to that client.
- When a client can no longer access the affected server, the following entry
comes up for the client system in the affected server's FileLog:
ProbeUuid failed for host xxx.xxx.xxx.xxx:7001
- Typing 'fs checkservers' on the affected client produces the following error:
These servers unavailable due to network or server problems: [affected server
hostname]
- Some other clients are able to access the server. I believe that this may be
due to the unaffected clients not accessing the volumes which were under
heavy use.
- Shutting down the AFS client and ensuring that the kernel module is removed,
then restarting the AFS client does not allow the affected client to access
the affected server.
- Restarting only the fs, volserver, and ptserver services on the affected
server does not allow the affected client to access the server. Shutting down
and then restarting the AFS service completely on the affected server also
has no effect on the affected client.
- Rebooting the affected client computer does allow it to access the affected
server.
- The servers are running OpenAFS 1.2.13; the affected client in this case is
also running 1.2.13. Older clients have also shown this behavior in the past.
- The firewall allows traffic initiated by the client, which tends to work.
This issue tends to happen every few months.
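To see which clients a server has flagged, and how often, the FileLog can be
scanned for the message quoted above. The following is only a minimal sketch:
the regular expression assumes the message looks exactly like the single line
shown in this post, and the function name count_probe_failures is made up for
illustration, not part of OpenAFS.

```python
import re
from collections import Counter

# Assumption: the message format matches the one FileLog line quoted in this
# post ("ProbeUuid failed for host <address>:<port>"). Other OpenAFS versions
# may word it differently.
PROBE_FAIL = re.compile(r"ProbeUuid failed for host (\S+):(\d+)")


def count_probe_failures(lines):
    """Return a Counter mapping client address -> number of ProbeUuid failures.

    `lines` is any iterable of log lines, e.g. an open FileLog file object.
    """
    counts = Counter()
    for line in lines:
        match = PROBE_FAIL.search(line)
        if match:
            # group(1) is the client address, group(2) the port (7001 here).
            counts[match.group(1)] += 1
    return counts
```

Passing an open FileLog to count_probe_failures and printing the result of
most_common() lists the clients with the most failed probes first. The path
to FileLog varies by installation, so it is left to the caller.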
The affected system at this point is my workstation, and the affected server
does not contain volumes which I need to access directly. Thus, I'm willing
to keep it online until I can determine the cause of the issue. Does this
issue sound familiar to anyone?
Regards,
Lester Barrows
Asani Solutions, LLC
Code TI Systems Group
NASA Ames Research Center