[OpenAFS-devel] Strange hangs in openafs 1.4.1 linux 2.6.17.7

Jerry Lundström jerry.lundstrom@it.su.se
Tue, 12 Sep 2006 10:28:06 +0200


Jerry Lundström wrote:
> Jeffrey Altman wrote:
>> The 1.4.1 server has a bug that results in significant delays in the
>> response to clients if there were outstanding callback breaks that could
>> not previously be delivered to the client and the client's IP address or
>> the port number has changed.  This is the bug to which I was referring
>> which was fixed prior to 1.4.2-beta-1.
> 
> This doesnt not explain the afs_cv_wait hangs where I clearly see the
> response from the client in the tcpdump running on the client. Neither
> the ip address or the port was changed in that 0.1sec of the
> fetch-status request and response.

Sorry, the meaning of this got all messed up.

When I run tcpdump on the client, I see the fetch-status request being
sent to the server and I see the server's response to the client, but the
process that sent the request has hung in afs_cv_wait. The server then
sends a couple more responses and, after a few seconds, a ping (or at least
I think it's a ping), but the process is still stuck in afs_cv_wait and I
can't strace the process or attach gdb to it.
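
For anyone trying to reproduce this, a rough sketch of how the state of the
stuck process could be checked without strace or gdb. It assumes a Linux
/proc filesystem; the helper names are made up for illustration. A process
that cannot be traced is often in uninterruptible sleep (state "D"), but
that is only a guess here and has not been confirmed for this hang.

```python
def parse_stat_state(stat_line: str) -> str:
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so take everything after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    return rest[0]


def describe(pid: int) -> str:
    """Summarise the state and kernel wait channel of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        state = parse_stat_state(f.read())
    try:
        # wchan names the kernel function the process sleeps in, when the
        # kernel exposes it; it may be empty, "0", or unreadable.
        with open(f"/proc/{pid}/wchan") as f:
            wchan = f.read().strip() or "?"
    except OSError:
        wchan = "?"
    return f"pid={pid} state={state} wchan={wchan}"
```

Calling describe() with the pid of the hung process returns a one-line
summary. If the kernel exposes the wait channel, it should name the
function the process is sleeping in.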

This has happened both with memcache and with the filecache on a ramdisk,
but never with the filecache on a real drive.

-- 
Jerry Lundström, System Developer
The Division of IT and media, Stockholm University, Sweden
+46 (0)8 16 19 99 / http://www.it.su.se