[OpenAFS] copying files to /afs filesystem hangs on system call close(4)

M. Jamal Green mgreen@odu.edu
28 May 2003 19:47:15 -0400


Thanks for the response! One thing I forgot to mention in my original
post was that the AFS Server is running Debian w/ OpenAFS 1.2.9 (I don't
know if that makes a difference) 

After using getting tcpdump while moving the same file to a /afs volume
on a debian client and a redhat client it seems that the redhat client
is sending fragmented packets to the Debian AFS server while the debian
client is not.

Here's a snippet (redhat client to /afs )

18:53:47.692451 debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
redhat-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack (65)
18:53:47.692713 debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
redhat-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack (65)
18:53:47.711513 redhat-afs-client.xxx.xxx.xxx >
debian-afs-server.xxx.xxx.xxx: udp (frag 17886:1384@1480)
18:53:47.711539 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1472) (frag
17886:1480@0+)
18:53:47.711570 redhat-afs-client.xxx.xxx.xxx >
debian-afs-server.xxx.xxx.xxx: udp (frag 17887:1384@1480)
18:53:47.711601 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1472) (frag
17887:1480@0+)
18:53:47.711680 redhat-afs-client.xxx.xxx.xxx >
debian-afs-server.xxx.xxx.xxx: udp (frag 17888:1384@1480)
18:53:47.711699 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1472) (frag
17888:1480@0+)
18:53:47.711724 redhat-afs-client.xxx.xxx.xxx >
debian-afs-server.xxx.xxx.xxx: udp (frag 17889:1384@1480)
18:53:47.711742 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1472) (frag
17889:1480@0+)
18:53:47.711813 redhat-afs-client.xxx.xxx.xxx >
debian-afs-server.xxx.xxx.xxx: udp (frag 17890:1384@1480)
18:53:47.711832 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1472) (frag
17890:1480@0+)
18:53:47.711858 redhat-afs-client.xxx.xxx.xxx >
debian-afs-server.xxx.xxx.xxx: udp (frag 17891:1384@1480)
18:53:47.711876 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1472) (frag
17891:1480@0+)
18:53:48.481876 redhat-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1440) (DF)


I see this a lot in my tcpdump output... 


Here's another snippet (debian client to /afs)

16:48:37.175811 debian-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1436) (DF)
16:48:37.175906 debian-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1436) (DF)
16:48:37.176639 debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
debian-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack (66)
16:48:37.176866 debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
debian-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack (67)
16:48:37.176877 debian-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1436) (DF)
16:48:37.176952 debian-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1436) (DF)
16:48:37.177848 debian-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1436) (DF)
16:48:37.177907 debian-afs-client.xxx.xxx.xxx.afs3-callback >
debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx data (1436) (DF)
16:48:37.179792 okinawa.occs.odu.edu.afs3-fileserver >
debian-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack (65)

no fragmenting at all in the tcpdump log.

Is this a known issue with AFS clients/servers ?





On Tue, 2003-05-27 at 13:38, Jaime Fournier wrote: 
> Can you run a tcpdump, or tethereal to see if it's
> sending packets?
> Maybe committing.
> See what packets are being set etc.
> I am willing to bet it's the commit on close.
> DFS semantics are going to be a lot different.
> 
> -- Jamal Green <mgreen@odu.edu> wrote:
> > Another Sys Admin and I have the privilege of
> > migrating our DCE/DFS 3.1 cell to an AFS cell. We're
> > new to AFS and we cant
> > seem to figure this one out.
> > 
> > Currently we're experiencing strange behaviour with
> > Redhat LINUX AFS
> > clients
> > using openafs 1.2.9. We do not seem to have this
> > problem with Debian.
> > 
> > The Redhat client is using Redhat 8.0, kernel
> > 2.4.18-27 and openafs 1.2.9.
> > We ran an strace while copying a 12MB file (a
> > tarred/gzipped file) from a
> > directory on an ext2 filesystem
> > and we see the file being copied and written without
> > a problem (and very
> > quickly I
> > might add)...then this happens:
> > 
> > [420d6948] read(3,
> >
> "\255\6\3602\211\35\r\216qB4\206\'\23#\21\245G\215\266N"...,
> > 4096) = 4096
> > [420d69c8] write(4,
> >
> "\255\6\3602\211\35\r\216qB4\206\'\23#\21\245G\215\266N"...,
> > 4096) = 4096
> > [420d6948] read(3,
> >
> "\201q\\.\312\31\233*r\10+\306\253\5(\324\363\353\346\n"...,
> > 4096) = 4096
> > [420d69c8] write(4,
> >
> "\201q\\.\312\31\233*r\10+\306\253\5(\324\363\353\346\n"...,
> > 4096) = 4096
> > [420d6948] read(3,
> >
> "2\235U\332\351\33\315\261\27\230Y\f*w\2\3016\353t\314k"...,
> > 4096) = 1129
> > [420d69c8] write(4,
> >
> "2\235U\332\351\33\315\261\27\230Y\f*w\2\3016\353t\314k"...,
> > 1129) = 1129
> > [420d6948] read(3, "", 4096)            = 0
> > [420d68c1] close(4
> > 
> > It hangs at the close(4). We have to wait over 10
> > minutes for the close to
> > complete. We've searched but we can't seem to find a
> > solution to this
> > particular problem. Anybody else experienced this
> > and found a work around?
> > 
> > We initially thought it was a caching problem
> > because the cache wound up on
> > an
> > ext3 filesystem. We rebuilt the workstation to use
> > ext2 only but that
> > hasn't
> > fixed it. We also think it's a timeout of some type,
> > but again we're not
> > experienced
> > enough with AFS to know for sure.
> > 
> > Any help or pointing in the right direction would be
> > GREATLY appreciated as
> > we
> > are under the gun to migrate our entire cell by the
> > end of the summer.
> > 
> > Thanks!
> > 
> > M. Jamal Green
> > UNIX Systems Administrator
> > Office of Computing and Communications Services
> > UNIX Support Group
> > [phone]:757-683-3678
> > [fax]:757-683-5155
> > http://www.lions.odu.edu
> > 
> > 
> > _______________________________________________
> > OpenAFS-info mailing list
> > OpenAFS-info@openafs.org
> > https://lists.openafs.org/mailman/listinfo/openafs-info
> 
> =====
> Jaime Fournier
> 
> __________________________________
> Do you Yahoo!?
> The New Yahoo! Search - Faster. Easier. Bingo.
> http://search.yahoo.com