[OpenAFS] Re: Advice on a use case

Andrew Deason <adeason@sinenomine.net>
Fri, 9 Nov 2012 15:47:35 -0600


On Fri, 9 Nov 2012 11:43:11 -0800
Timothy Balcer <timothy@telmate.com> wrote:

> > Creating lots of files is not fast. Due to the consistency
> > guarantees of AFS, you have to wait for at least a network RTT for
> > every single file you create. That is, a mkdir() call is going to
> > take at least 50ms if the server is 50ms away. Most/all recursive
> > copying tools will wait for that mkdir() to complete before doing
> > anything else, so it's slow.
>
> Yes.. I understand that. I was commenting on the slowness as compared
> to rsyncing over NFS, for example, which takes 5 hours for the entire
> tree when done from the top level of the tree. That tree contains 15
> of the directories that I mentioned in my earlier post. So 15 * 24k
> dirs.. and to answer the question, 232,974 files of small size for the
> one subdirectory in question.

I'm getting a little mixed up about when you're talking about the
'NFS solution' vs. the 'AFS solution'. Are you saying it took 1.5
hours to transfer 15 subdirs into AFS, whereas it took NFS 5 hours to
transfer 24,000 subdirs?
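
(Just to make the RTT arithmetic above concrete: at 50ms per
serialized mkdir(), creating 24,000 directories costs
24000 * 0.05s = 1200s, or about 20 minutes, in mkdir() round trips
alone, before any file data moves.)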

> > What can possibly make this faster, then, is copying/creating
> > files/directories in parallel. <snip>
> 
> Yes, I routinely run 100s of parallel transfers using a combination
> of tar and rsync: tar gets the files over in raw form, and rsync
> mops up behind. The rsync pass corrects any problems with the tar
> copy, and is run twice on a fixed list generated at transfer time. I
> have found that even when using a tuned rsync process designed to
> improve transfer speeds, many parallel tar/untar processes from
> local to NFSv4 followed by a "local" rsync to the same destination
> work better for new files, when timeliness is important.

So... are you just talking about the NFS transfers here?

With AFS, trying to rsync again afterwards can be much slower due to
cache churn if the number of files in question is larger than the
stat cache (as Jeff said earlier). You also cannot have more than 4
simultaneous RPCs in flight to the AFS fileserver with current client
releases (the limit is per PAG, so you can work around it by fiddling
with PAGs, or by using some AFS-specific tools). So 100s of parallel
transfers aren't really helpful.
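
For illustration only, here is a rough sketch of the PAG trick (not
an existing tool; the paths, subdir names, and worker count are
invented, and it assumes your pagsh passes "-c <cmd>" through to the
shell it execs). Each worker runs in its own PAG, so each one gets
its own budget of simultaneous RPCs; a fresh PAG has no tokens, so
aklog runs inside it first:

    # Sketch: run a handful of rsync workers, each inside its own PAG.
    import subprocess

    subdirs = ["d%02d" % i for i in range(8)]   # hypothetical subdirs
    workers = []
    for d in subdirs:
        # pagsh creates a new PAG and execs a shell with these args;
        # aklog obtains tokens for the new PAG before the copy starts.
        cmd = "aklog; rsync -a /src/%s/ /afs/example.com/dst/%s/" % (d, d)
        workers.append(subprocess.Popen(["pagsh", "-c", cmd]))
    for w in workers:
        w.wait()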

So if you want to write the tooling for it, I think what you would
ideally want is a cp/rsync-like tool that copies files/dirs using a
separate thread for each file/dir, up to some configured limit,
tracking dependencies so that parent dirs are created first, and
possibly launching new processes in separate PAGs as you go.
Alternatively, you could use utilities or APIs that speak to AFS
directly without going through the filesystem layer (like
afscp/afsio), so you wouldn't need separate threads or processes.
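
To make that concrete, here is a rough sketch of the thread-per-file
idea in Python (again, not an existing tool; the paths and worker
limit are made up, and a real tool would want error handling and
could combine this with the PAG trick above):

    # Sketch: walk the source tree top-down so every parent directory
    # is created before its children, and hand the individual file
    # copies to a bounded thread pool.
    import os
    import shutil
    from concurrent.futures import ThreadPoolExecutor

    def parallel_copytree(src, dst, max_workers=16):
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            for dirpath, dirnames, filenames in os.walk(src):
                rel = os.path.relpath(dirpath, src)
                target = os.path.normpath(os.path.join(dst, rel))
                # Each mkdir is still a network RTT, and serialized
                # here; a smarter tool would issue these in parallel
                # too, subject to parent-before-child ordering.
                if not os.path.isdir(target):
                    os.makedirs(target)
                for name in filenames:
                    pool.submit(shutil.copy2,
                                os.path.join(dirpath, name),
                                os.path.join(target, name))

    parallel_copytree("/local/src/tree", "/afs/example.com/vol/tree")

Note that past 4 in-flight RPCs per PAG the extra threads just queue
up on the client, which is why the PAG trick matters.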

Am I making any sense here? What I describe above is the way to
transfer lots of little files into AFS more quickly (and possibly
into other network filesystems, depending on their consistency
guarantees), regardless of other schemes like OSD being in play.
Something like that would be useful in general for people who want to
copy a large tree or untar something into AFS. As far as I know it
does not already exist, but I could be wrong.

-- 
Andrew Deason
adeason@sinenomine.net