[OpenAFS-devel] Re: progress... sortof...

Nickolai Zeldovich kolya@MIT.EDU
Wed, 28 Apr 2004 19:03:14 -0700


On Wed, 2004-04-28 at 20:36 -0500, Neulinger, Nathan wrote:
> if you take a look at some of the call counts with
> xstat_cm_test on some of my machines - even if the improvement is
> negligible, shrinking the code path on something executed so many times
> has got to have some improvement. The below numbers are from one of my
> machines that has been up for 4 days. 
> 
>            1456177 afs_open
>            2225818 osi_Read
> ...
>           30356351 afs_PutDCache
>           59459894 afs_PutVCache
> ...
>          131328561 afs_InitReq
>          131397954 PagInCred
>          135742910 afs_CopyOutAttrs

So assuming you gain maybe 100 cycles by inlining a function (I'd be a
bit surprised if it was that much), that figures to about 9 billion
cycles for inlining afs_PutDCache and afs_PutVCache, or about 9 seconds
of CPU time saved (over 4 days) on a 1GHz machine.  If you divide this
by the number of times you opened AFS files, you get a savings of
about 6 microseconds per open :-)
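
For anyone who wants to redo the arithmetic above with their own counters,
here is a minimal sketch.  The call counts are copied from the quoted
xstat_cm_test output; the 100-cycle saving and the 1GHz clock are the same
guesses used in the estimate, not measurements, and the function and
constant names are made up for illustration.

```python
# Call counts copied from the quoted xstat_cm_test output (4 days of uptime).
AFS_PUT_DCACHE_CALLS = 30_356_351
AFS_PUT_VCACHE_CALLS = 59_459_894
AFS_OPEN_CALLS = 1_456_177

# Assumptions taken from the estimate in the text, not measurements.
CYCLES_SAVED_PER_CALL = 100      # generous guess for one inlined call
CLOCK_HZ = 1_000_000_000         # 1GHz machine


def inlining_savings(calls, cycles_per_call=CYCLES_SAVED_PER_CALL,
                     clock_hz=CLOCK_HZ):
    """Return (cycles saved, CPU seconds saved) for `calls` inlined calls."""
    cycles = calls * cycles_per_call
    return cycles, cycles / clock_hz


total_calls = AFS_PUT_DCACHE_CALLS + AFS_PUT_VCACHE_CALLS
cycles_saved, seconds_saved = inlining_savings(total_calls)

# Spread the saving over every afs_open call to get a per-open figure.
microseconds_per_open = seconds_saved / AFS_OPEN_CALLS * 1e6

print(f"calls inlined:      {total_calls}")
print(f"cycles saved:       {cycles_saved:.2e}")
print(f"CPU seconds saved:  {seconds_saved:.2f}")
print(f"saved per afs_open: {microseconds_per_open:.2f} microseconds")
```

Plugging in a different per-call saving or clock speed only scales the
result linearly, so the conclusion is not sensitive to the exact guess.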

The larger functions at the bottom seem to contain a bit more code than
afs_PutVCache/afs_PutDCache, and inlining them doesn't seem to be a
clear win either.

I'm rather doubtful that saving simple CPU instructions is what you need
to make AFS run fast.  AFS was slow on 500MHz machines way back when,
and it's still slow on 3GHz machines today, which are running all those
instructions at, let's say, 6x the speed.  What this tells you is that
you could have optimized AFS to run 6x fewer instructions and it would
still have been slow.
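
One way to put numbers on that argument is an Amdahl's-law style
estimate: if only part of the elapsed time is spent executing
instructions, speeding up that part by 6x helps only in proportion to
its share.  The sketch below assumes this simple model applies; the
CPU-bound fractions are hypothetical values chosen for illustration, not
measurements of the AFS client, and the function name is made up.

```python
def overall_speedup(cpu_fraction, cpu_speedup):
    """Amdahl-style estimate: only the CPU-bound share of the time shrinks.

    cpu_fraction: share of elapsed time spent executing instructions (0..1)
    cpu_speedup:  factor by which that share gets faster
    """
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


# Hypothetical CPU-bound fractions, for illustration only (not measured).
for cpu_fraction in (0.05, 0.25, 0.50):
    speedup = overall_speedup(cpu_fraction, 6.0)
    print(f"{cpu_fraction:.0%} CPU-bound: 6x fewer instructions -> "
          f"{speedup:.2f}x faster overall")
```

Under this model, a client that spends most of its time waiting on
something other than the CPU gains very little from a 6x reduction in
instructions, which is why measuring where the time goes comes first.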

Something else is wrong with the AFS client -- performance profiling
should be done on it before you start making small optimizations.  One
way to go about this, which I started playing with recently, is to
run the user-mode AFS client (libuafs) and profile that.  User-mode
profiling tools are much more widely available than kernel profilers,
and the code base is largely the same.  You just need to hook some
workload onto the user-space client.

If you (or anyone) want to play with that approach to debugging AFS
client performance, you're welcome to look at my small example of using
UAFS:

    http://mit.edu/kolya/tmp/uafstest/

-- kolya