[OpenAFS] Re: Fair bandwidth distribution, performance of OpenAFS on win32

Andrew Deason adeason@sinenomine.net
Wed, 10 Nov 2010 20:23:20 -0600


First of all: versions? Server version is most interesting to me, but
the various clients would be good to know (in particular the Windows
ones).

On Wed, 10 Nov 2010 18:42:00 +0100
Matthias Gerstner <matthias.gerstner@esolutions.de> wrote:

> The problem I have is that time and again a user needs to perform
> rather performance-killing operations on AFS. That is compiling a large
> software project with accessing a large number of small files and also
> producing a large number of files.

If possible: "don't do that" :) The user is going to get faster compiles
on local disk anyway, and I don't expect e.g. intermediary .o files to
be very important.

But aside from that...

> In this situation other users with moderate load on AFS suffer a bad
> user experience. That is not necessarily that absolute bandwidth is too
> low. It is rather that interactive work becomes annoying as operations
> on AFS tend to block repeatedly. For example writing a small text file
> can take up to five seconds until suddenly everything goes back to
> normal.

What fileserver parameters are you running with? If you are running with
the defaults, you are going to be sad; in that case the first suggestion
is to try '-L -p 128' (the large configuration, with 128 server threads)
and see if that improves things.
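
If you're not sure what the fileserver is actually being started with,
the 'parm' line for the fs bnode in BosConfig should tell you. A rough
Python sketch of that check follows; the BosConfig excerpt and the
binary path in it are made up for illustration, and yours will likely
differ.

```python
import shlex

# Hypothetical BosConfig excerpt; the path and options are illustrative only.
SAMPLE_BOSCONFIG = """\
bnode fs fs 1
parm /usr/afs/bin/fileserver
parm /usr/afs/bin/volserver
parm /usr/afs/bin/salvager
end
"""

def fileserver_args(bosconfig_text):
    """Return the options on the first 'parm' line that runs a fileserver."""
    for line in bosconfig_text.splitlines():
        words = shlex.split(line)
        if (len(words) >= 2 and words[0] == "parm"
                and words[1].endswith("/fileserver")):
            return words[2:]
    return None  # no fileserver parm line found

def missing_suggestions(args):
    """List which of the suggested options ('-L', '-p') are not present."""
    return [opt for opt in ("-L", "-p") if opt not in args]

args = fileserver_args(SAMPLE_BOSCONFIG)
print("fileserver options:", args)
print("not set:", missing_suggestions(args))
```

An empty option list here means the fileserver is running with the
defaults.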

Also, on the Linux client side, what options are you giving to afsd?

> I wonder what would be the best approach to improve the user experience
> for such cases. A low-level approach like extensions to the TCP/IP stack
> on the Linux server machine might be one example. But I feel that given
> the complexity of OpenAFS this is probably not the route to take.

Almost all of AFS traffic goes over RX, which is layered on top of UDP,
so tweaks to the TCP stack are not going to help you.

> So I thought maybe using the distribution properties of OpenAFS might
> be a better way. Are there any best practices for such a scenario?

If you know what volumes are giving you grief, you could move that
volume (or volumes) to be on their own small fileserver, which would
isolate their activity from the rest of the cell.

That's a bit drastic, though; I'd expect you can fix the performance
problems without isolating them like that. It depends on what
performance bottleneck you are hitting. If it's disk, you could put the
troublesome volumes just on another partition on the same server. If
you're running out of fileserver threads, '-p 128' should solve that. If
you're hitting bandwidth limits or RX performance limits, you can tweak
RX settings.
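
For the "another partition" or "own fileserver" options, the tool is
'vos move'. A minimal sketch of building that command from Python; the
volume name, server name, and partition letters are hypothetical
placeholders, and the command is only printed unless you ask for it to
be run.

```python
import subprocess

def vos_move_cmd(volume, from_server, from_part, to_server, to_part):
    """Build the argv for 'vos move' of one volume."""
    return ["vos", "move",
            "-id", volume,
            "-fromserver", from_server, "-frompartition", from_part,
            "-toserver", to_server, "-topartition", to_part]

def move_volume(volume, from_server, from_part, to_server, to_part,
                dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = vos_move_cmd(volume, from_server, from_part, to_server, to_part)
    print(" ".join(cmd))
    if not dry_run:
        # Needs admin tokens (or -localauth added to the command).
        subprocess.run(cmd, check=True)
    return cmd

# Hypothetical names: move a busy build volume from partition a to b
# on the same server.
move_volume("proj.build", "fs1.example.com", "a", "fs1.example.com", "b")
```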

When it's being slow, try running 'rxdebug <fileserver>' and 'rxdebug
<fileserver> -rxstats' and put the results somewhere if you're
comfortable with sharing them.
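
If you want a quick look yourself before sending anything, the lines to
watch in the plain rxdebug output are the thread counts. A small sketch
that pulls them out, assuming the output contains lines of the form
"N calls waiting for a thread" and "N threads are idle" (check that
against what your version prints); the sample text and hostname below
are made up.

```python
import re
import subprocess

def rxdebug_cmd(server, rxstats=False):
    """Build the rxdebug argv used above, optionally with -rxstats."""
    cmd = ["rxdebug", server]
    if rxstats:
        cmd.append("-rxstats")
    return cmd

def thread_summary(output):
    """Return (calls_waiting, threads_idle); None where a line is absent."""
    waiting = re.search(r"(\d+) calls waiting for a thread", output)
    idle = re.search(r"(\d+) threads are idle", output)
    return (int(waiting.group(1)) if waiting else None,
            int(idle.group(1)) if idle else None)

def check_server(server):
    """Run rxdebug against a live server and summarize its thread state."""
    result = subprocess.run(rxdebug_cmd(server), capture_output=True,
                            text=True, check=True)
    return thread_summary(result.stdout)

# Made-up sample of what a starved fileserver might print.
SAMPLE = """\
12 calls waiting for a thread
0 threads are idle
"""
print(thread_summary(SAMPLE))
# On a real cell (hypothetical host): check_server("fs1.example.com")
```

A nonzero "calls waiting" count together with zero idle threads while
things are slow would point at running out of fileserver threads.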

> On a different matter I experience that running the OpenAFS client on
> MS Windows turns out generally really, *really* slow. From my knowledge
> MS Windows simply isn't very good regarding file operations even on
> local disks in comparison to UNIX systems. But working with OpenAFS on
> MS Windows is even worse than the usual. When comparing the performance
> between the Windows and the Linux OpenAFS clients I'm at least four
> times slower on Windows than on Linux. This can also be experienced
> during interactive work with the Windows client when operating on AFS.
> Question is if this is a known fact. Or if not so what I could do to
> relieve the problem.

Windows client version? (I won't have any answers for you, but I'm
pretty sure that info is helpful)

-- 
Andrew Deason
adeason@sinenomine.net