[OpenAFS] Re: Openafs performance with Apache/PHP

Nate Gordon nlgordon@gmail.com
Mon, 13 Aug 2007 16:46:41 -0500

So after my server had yet another round of fits, I'm going to bring
this back up.  My cache is currently hovering around 26GB used, which
makes some sense since I've probably got ~35GB of actual content.  Is
this too large a cache?  I know previous versions of OpenAFS had
issues with this.

I would be willing to try devel versions of things for testing.  I
would actually like to, except that I'm not a kernel genius and the
last time I tried 1.5 (1.5.21) I couldn't get it to compile against a
RHEL 5 kernel.

Is there any way to tune the afs client to deal better with lots and
lots of small file operations?

With the start of classes coming up soon, this will most likely cause
my server to do very bad things on a very regular basis.  I'm willing
to contemplate any idea at least once.

On 7/27/07, Nate Gordon <nlgordon@gmail.com> wrote:
> So I'm finally getting around to querying about some issues I've been
> having for a long time with openafs 1.4.4 on rhel (3|4|5).  In short,
> the performance of my webservers serving up PHP code is less than
> stellar in some cases, specifically once the aggregate request rate
> climbs above about 30 requests per second.  Here is an
> example/summary of the issue I'm having:
> I have a php page, that php page includes some other php files and
> makes two database queries.  If I point apache bench at that page on
> one of my test servers I get results as follows:
> 1 concurrent user: 16 requests per second 60ms per request
> 2 concurrent users: 27 requests per second 73ms per request
> 5 concurrent users: 32 requests per second 157ms per request
> 10 concurrent users: 10 requests per second 934ms per request
> 20 concurrent users: 5 requests per second 3666ms per request
> 30 concurrent users: 3 requests per second 10899ms per request
> * requests per second being defined as the aggregate requests per
> second across all the concurrent users.  Thus 3 users at 3 requests
> per second each yields 9 requests per second aggregate.
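
As a sanity check on the table above: in a closed-loop benchmark like ab,
mean latency is roughly concurrency divided by aggregate throughput
(Little's law), and the quoted figures track that closely.  A minimal
sketch, using the openafs numbers from the table:

```python
# Closed-loop load test: mean latency ~= concurrency / aggregate req/s
# (Little's law).  Tuples are the openafs figures quoted above:
# (concurrent users, aggregate requests/s, reported ms per request).
results = [
    (1, 16, 60),
    (2, 27, 73),
    (5, 32, 157),
    (10, 10, 934),
    (20, 5, 3666),
    (30, 3, 10899),
]

for users, rps, reported_ms in results:
    predicted_ms = users / rps * 1000.0
    print(f"{users:2d} users: predicted ~{predicted_ms:.0f} ms, reported {reported_ms} ms")
```

The 10-to-30 user rows land within about 10% of the prediction, which
suggests the latency blowup is fully explained by the throughput collapse
rather than by some extra queueing effect on top of it.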
> So as you can see performance drops off to an unusable point.  What I
> also noted was that while watching vmstat the number of context
> switches soars through the roof.  At the 10 user mark the switch rate
> hovers around ~200,000 per second.  At 20 users we go to ~250,000 and
> at 30 users we hit ~270,000 per second.  The machine I'm testing on is
> a Dell dual xeon 2.8Ghz hyperthreaded machine with 2GB of ram.  The
> machine isn't swapping during any of this and general memory usage
> remains fairly constant.  Afsd is configured with:
> AFSD_ARGS="-fakestat-all -chunksize 20 -volumes 400 -dcache 50000
> -files 200000 -stat 150000 -daemons 20"
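
For reference, -chunksize is an exponent, so 20 gives 2^20-byte (1 MiB)
chunks; combined with the 100MB cache mentioned in the message, that
bounds how many full-size chunks can be resident at once.  A rough
sketch, assuming that power-of-two reading of the flag:

```python
# Rough cache geometry implied by the flags above.  Assumes -chunksize N
# means 2^N-byte chunks (so 20 -> 1 MiB) and the 100MB cache size
# quoted in the message.
chunksize_exp = 20
cache_bytes = 100 * 1024 * 1024

chunk_bytes = 2 ** chunksize_exp
print(f"chunk size: {chunk_bytes} bytes")              # 1048576
print(f"full chunks that fit: {cache_bytes // chunk_bytes}")  # 100
```

So -files 200000 configures vastly more cache files than ~100 full-size
chunks would ever need, though with mostly small PHP files most chunks
will stay well under 1 MiB, so the real resident count sits somewhere in
between.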
> I've tried varying the various options to this to try to improve
> performance, but things only seem to get worse.  Cache size on this
> machine is currently 100MB and the total content being served is <
> 1MB.
> For completeness I've also tested this setup on a similar machine
> running rhel 5 with arla-current and have come up with some
> interesting results:
> 1 user: 22 requests per second 43ms per request
> 2 users: 32 requests per second 62ms per request
> 5 users: 36 requests per second 138ms per request
> 10 users: 36 requests per second 274ms per request
> 20 users: 37 requests per second 542ms per request
> 30 users: 36 requests per second 821ms per request
> This shows me exactly what I would expect to happen on the server:
> aggregate requests per second reach a plateau, and past that point
> adding users only increases the individual request time.
> Before anyone gets offended by anything I might have said, I want to
> make it absolutely clear that I am not trying to say that either
> openafs or arla is better/worse than the other.  I needed something to
> compare to for the purposes of this discussion.  I realize that the
> two are very very differently architected in their interactions with
> the kernel.  I would like to stick with openafs on my servers since
> that has official support from my systems group.
> So then I start to wonder about those differences in architecture.  Is
> there a point in openafs where it is essentially single-threaded?
> Possibly in cache reading/validation?  Varying the number
> of daemons I'm running doesn't appear to affect the performance
> greatly until I drop down to something like 1 daemon.
> A little bit more about the PHP code I'm running and about PHP in
> general.  When you do a require_once in PHP to include a file it
> essentially stats every directory component between / and the full
> path to the file it is trying to include.  It does this on every
> include and does not cache the results.  It also does this in an
> attempt to locate the file if the include was done as a relative path,
> i.e.:
> CWD="/afs/cell/folder/virtualhost/directory/page.php"
> include_path = ".:/usr/local/lib/php/:/afs/cell/folder/where/my/includes/are/"
> require_once("relative/path/include.inc.php");
> This would generate stat system calls in approximately this pattern:
> Check first item in include_path for include ("."):
> /
> /afs
> /afs/cell
> /afs/cell/folder
> /afs/cell/folder/virtualhost
> /afs/cell/folder/virtualhost/directory
> /afs/cell/folder/virtualhost/directory/relative <-- Fails since node
> doesn't exist
> Check second item in include_path for include ("/usr/local/lib/php/"):
> /
> /usr
> /usr/local
> /usr/local/lib
> /usr/local/lib/php
> /usr/local/lib/php/relative <-- Fails again for same reason
> Check third item in include_path ("/afs/cell/folder/where/my/includes/are"):
> /
> /afs
> /afs/cell
> /afs/cell/folder
> /afs/cell/folder/where
> /afs/cell/folder/where/my
> /afs/cell/folder/where/my/includes
> /afs/cell/folder/where/my/includes/are
> /afs/cell/folder/where/my/includes/are/relative
> /afs/cell/folder/where/my/includes/are/relative/path
> /afs/cell/folder/where/my/includes/are/relative/path/include.inc.php
> <-- Actual fopen
> Everything except the final fopen are lstat64 calls as indicated by strace.
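
That walk can be sketched in Python (not PHP, just to count it).  The
sketch enumerates every prefix lstat for each include_path entry, using
the hypothetical paths from the example above; it counts the full chain
for every entry, so it slightly overcounts the two failing entries,
where PHP stops at the first missing component:

```python
# Enumerate the lstat walk described above: for each include_path entry,
# every directory component from / down to the candidate file gets an
# lstat.  Paths are the hypothetical ones from the example.
from pathlib import PurePosixPath

cwd = "/afs/cell/folder/virtualhost/directory"
include_path = [
    ".",  # resolved against the directory of the running script
    "/usr/local/lib/php/",
    "/afs/cell/folder/where/my/includes/are/",
]
relative = "relative/path/include.inc.php"

def component_walk(base, rel):
    """Yield every prefix path that would be lstat'ed, root first."""
    full = PurePosixPath(cwd if base == "." else base) / rel
    parts = full.parts  # e.g. ('/', 'afs', 'cell', ...)
    for i in range(1, len(parts) + 1):
        yield str(PurePosixPath(*parts[:i]))

total = 0
for entry in include_path:
    walk = list(component_walk(entry, relative))
    total += len(walk)
    print(f"{entry!r}: up to {len(walk)} lstat calls")
print(f"upper bound: {total} stat-family calls for one require_once")
```

That's an upper bound of 28 stat-family calls for a single require_once;
multiply by the ~20 includes the page does and you're looking at
something on the order of 500 path lookups per page hit before any real
work happens.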
> As you can see this is pretty expensive to do per page hit, especially
> when this is done repeatedly for every included file.  The page in
> question includes about 20 or so files.  This is one problem.  I've
> also put together the same page by pulling in all the code that would
> have been included to generate a self contained php file.  The
> performance tops out around 50 requests per second and still has the
> same dropoff pattern as you increase the number of simultaneous users
> hitting the page when using openafs.  So the problem isn't purely in
> the excessive number of stat system calls.  If anything it shows that
> there is a problem somewhere else as well.  Possibly the openafs
> daemon threads have to be switched in to answer requests too often,
> or the threads have to talk to each other on every query to the afs
> file system.
> This topic of discussion might be more suited to the dev list, but I
> thought I would start here to see if anyone else was serving php files
> out of AFS and could reproduce my problems in their environment.
> If any more information is needed I would be glad to provide it.
> Thanks in advance,
> --
> -Nathan Gordon
> If the database server goes down and there is no code to hear it, does
> it really go down?
> <esc>:wq<CR>

-Nathan Gordon

If the database server goes down and there is no code to hear it, does
it really go down?