[OpenAFS] copying files to /afs filesystem hangs on system call close(4)

M. Jamal Green mgreen@odu.edu
02 Jun 2003 17:38:57 -0400


Jaime,

Thanks for your help. But we feel the delay of the close had something
to do with the kernel version we were running on the debian test server
(it was a 2.2.22..yeah I know..ancient).  Like you mentioned, we still
see fragmentation when looking at our tcpdumps but when we performed our
tests with the production Redhat AFS servers..the problem disappeared. 

To answer your question though, the answer is "no". The transfer of data
grams kicks off right away. Its still a mystery to us, but all of it
points to the old kernel under debian. Works like a champ under Redhat.

Thanks for your time in trying to help us out and pointing us in right
directions...we really appreciated it.




On Thu, 2003-05-29 at 19:08, Jaime Fournier wrote:
> Fragmentation is normal at the IP level with AFS, as
> it is DFS.
> It's the payload of the activity you are doing.
> If you start a StoreData then there will be one line
> of
> StoreData request, then as many IP Datagrams needed to
> 
> handle the data being transfered. 
> 
> What I am wondering is timing.
> Does the transfer of the datagrams only kick off when
> you close?
> If so that would explain the delay in the close, until
> the data has been commited to stable storage.
> 
> --- "M. Jamal Green" <mgreen@odu.edu> wrote:
> > Thanks for the response! One thing I forgot to
> > mention in my original
> > post was that the AFS Server is running Debian w/
> > OpenAFS 1.2.9 (I don't
> > know if that makes a difference) 
> > 
> > After using getting tcpdump while moving the same
> > file to a /afs volume
> > on a debian client and a redhat client it seems that
> > the redhat client
> > is sending fragmented packets to the Debian AFS
> > server while the debian
> > client is not.
> > 
> > Here's a snippet (redhat client to /afs )
> > 
> > 18:53:47.692451
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack
> > (65)
> > 18:53:47.692713
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack
> > (65)
> > 18:53:47.711513 redhat-afs-client.xxx.xxx.xxx >
> > debian-afs-server.xxx.xxx.xxx: udp (frag
> > 17886:1384@1480)
> > 18:53:47.711539
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1472) (frag
> > 17886:1480@0+)
> > 18:53:47.711570 redhat-afs-client.xxx.xxx.xxx >
> > debian-afs-server.xxx.xxx.xxx: udp (frag
> > 17887:1384@1480)
> > 18:53:47.711601
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1472) (frag
> > 17887:1480@0+)
> > 18:53:47.711680 redhat-afs-client.xxx.xxx.xxx >
> > debian-afs-server.xxx.xxx.xxx: udp (frag
> > 17888:1384@1480)
> > 18:53:47.711699
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1472) (frag
> > 17888:1480@0+)
> > 18:53:47.711724 redhat-afs-client.xxx.xxx.xxx >
> > debian-afs-server.xxx.xxx.xxx: udp (frag
> > 17889:1384@1480)
> > 18:53:47.711742
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1472) (frag
> > 17889:1480@0+)
> > 18:53:47.711813 redhat-afs-client.xxx.xxx.xxx >
> > debian-afs-server.xxx.xxx.xxx: udp (frag
> > 17890:1384@1480)
> > 18:53:47.711832
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1472) (frag
> > 17890:1480@0+)
> > 18:53:47.711858 redhat-afs-client.xxx.xxx.xxx >
> > debian-afs-server.xxx.xxx.xxx: udp (frag
> > 17891:1384@1480)
> > 18:53:47.711876
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1472) (frag
> > 17891:1480@0+)
> > 18:53:48.481876
> > redhat-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1440) (DF)
> > 
> > 
> > I see this a lot in my tcpdump output... 
> > 
> > 
> > Here's another snippet (debian client to /afs)
> > 
> > 16:48:37.175811
> > debian-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1436) (DF)
> > 16:48:37.175906
> > debian-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1436) (DF)
> > 16:48:37.176639
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
> > debian-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack
> > (66)
> > 16:48:37.176866
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver >
> > debian-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack
> > (67)
> > 16:48:37.176877
> > debian-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1436) (DF)
> > 16:48:37.176952
> > debian-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1436) (DF)
> > 16:48:37.177848
> > debian-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1436) (DF)
> > 16:48:37.177907
> > debian-afs-client.xxx.xxx.xxx.afs3-callback >
> > debian-afs-server.xxx.xxx.xxx.afs3-fileserver:  rx
> > data (1436) (DF)
> > 16:48:37.179792 okinawa.occs.odu.edu.afs3-fileserver
> > >
> > debian-afs-client.xxx.xxx.xxx.afs3-callback:  rx ack
> > (65)
> > 
> > no fragmenting at all in the tcpdump log.
> > 
> > Is this a known issue with AFS clients/servers ?
> > 
> > 
> > 
> > 
> > 
> > On Tue, 2003-05-27 at 13:38, Jaime Fournier wrote: 
> > > Can you run a tcpdump, or tethereal to see if it's
> > > sending packets?
> > > Maybe committing.
> > > See what packets are being set etc.
> > > I am willing to bet it's the commit on close.
> > > DFS semantics are going to be a lot different.
> > > 
> > > -- Jamal Green <mgreen@odu.edu> wrote:
> > > > Another Sys Admin and I have the privilege of
> > > > migrating our DCE/DFS 3.1 cell to an AFS cell.
> > We're
> > > > new to AFS and we cant
> > > > seem to figure this one out.
> > > > 
> > > > Currently we're experiencing strange behaviour
> > with
> > > > Redhat LINUX AFS
> > > > clients
> > > > using openafs 1.2.9. We do not seem to have this
> > > > problem with Debian.
> > > > 
> > > > The Redhat client is using Redhat 8.0, kernel
> > > > 2.4.18-27 and openafs 1.2.9.
> > > > We ran an strace while copying a 12MB file (a
> > > > tarred/gzipped file) from a
> > > > directory on an ext2 filesystem
> > > > and we see the file being copied and written
> > without
> > > > a problem (and very
> > > > quickly I
> > > > might add)...then this happens:
> > > > 
> > > > [420d6948] read(3,
> > > >
> > >
> >
> "\255\6\3602\211\35\r\216qB4\206\'\23#\21\245G\215\266N"...,
> > > > 4096) = 4096
> > > > [420d69c8] write(4,
> > > >
> > >
> >
> "\255\6\3602\211\35\r\216qB4\206\'\23#\21\245G\215\266N"...,
> > > > 4096) = 4096
> > > > [420d6948] read(3,
> > > >
> > >
> >
> "\201q\\.\312\31\233*r\10+\306\253\5(\324\363\353\346\n"...,
> > > > 4096) = 4096
> > > > [420d69c8] write(4,
> > > >
> > >
> >
> "\201q\\.\312\31\233*r\10+\306\253\5(\324\363\353\346\n"...,
> > > > 4096) = 4096
> > > > [420d6948] read(3,
> > > >
> > >
> >
> "2\235U\332\351\33\315\261\27\230Y\f*w\2\3016\353t\314k"...,
> > > > 4096) = 1129
> > > > [420d69c8] write(4,
> > > >
> > >
> >
> "2\235U\332\351\33\315\261\27\230Y\f*w\2\3016\353t\314k"...,
> > > > 1129) = 1129
> > > > [420d6948] read(3, "", 4096)            = 0
> > > > [420d68c1] close(4
> > > > 
> > > > It hangs at the close(4). We have to wait over
> > 10
> > > > minutes for the close to
> > > > complete. We've searched but we can't seem to
> > find a
> > > > solution to this
> > > > particular problem. Anybody else experienced
> > this
> > 
> === message truncated ===
> 
> 
> =====
> Jaime Fournier
> 
> __________________________________
> Do you Yahoo!?
> Yahoo! Calendar - Free online calendar with sync to Outlook(TM).
> http://calendar.yahoo.com