[OpenAFS] Re: Openafs performance with Apache/PHP

Jason Edgecombe jason@rampaginggeek.com
Mon, 13 Aug 2007 19:02:02 -0400


Have you tried tinkering with the chunksize value? It's specified when 
afsd is run and tells the client how big of a chunk of each file to request.
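
For reference, my understanding of the afsd man page is that -chunksize is not a byte count but an exponent on 2, so the "-chunksize 20" in the quoted AFSD_ARGS would work out to 1 MiB chunks. A quick sketch of that arithmetic (the helper name chunk_bytes is made up for illustration, and the exponents in the loop are only sample values, not recommendations):

```python
def chunk_bytes(chunksize_exponent: int) -> int:
    """Chunk size in bytes, assuming afsd treats -chunksize as a power of 2."""
    return 2 ** chunksize_exponent


# Print a few candidate exponents next to the chunk size they would imply.
for exponent in (16, 18, 20):
    print(f"-chunksize {exponent}: {chunk_bytes(exponent) // 1024} KiB")
```

With lots of small PHP files, a smaller exponent may be worth testing, but that is a guess to be measured rather than a known fix.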


Nate Gordon wrote:
> So after my server had yet another round of fits, I'm going
> to bring this back up.  My cache is currently hanging around 26GB
> used, which makes some sense since I've probably got ~35GB of actual
> content.  Is this too large of a cache?  I know previous versions of
> openafs had issues with this.
> I would be willing to try devel versions of things for testing.  I
> would actually like to, except that I'm not a kernel genius and the
> last time I tried 1.5 (1.5.21) I couldn't get it to compile against a
> rhel5 kernel.
> Is there any way to tune the afs client to deal better with lots and
> lots of small file operations?
> With the start of classes coming up soon this will most likely cause
> my server to do very bad things on a very regular basis.  I'm willing
> to contemplate any idea at least once.
> On 7/27/07, Nate Gordon <nlgordon@gmail.com> wrote:
>> So I'm finally getting around to querying about some issues I've been
>> having for a long time with openafs 1.4.4 on rhel (3|4|5).  In short,
>> the performance of my webservers serving up PHP code is less than
>> stellar in some cases.  Those cases being defined as the requests per
>> second getting above 30.  Here is an example/summary of the issue I'm
>> having:
>> I have a php page, that php page includes some other php files and
>> makes two database queries.  If I point apache bench at that page on
>> one of my test servers I get results as follows:
>> 1 concurrent user: 16 requests per second 60ms per request
>> 2 concurrent users: 27 requests per second 73ms per request
>> 5 concurrent users: 32 requests per second 157ms per request
>> 10 concurrent users: 10 requests per second 934ms per request
>> 20 concurrent users: 5 requests per second 3666ms per request
>> 30 concurrent users: 3 requests per second 10899ms per request
>> * requests per second being defined as the aggregate requests per
>> second across all the concurrent users.  Thus 3 users at 3 requests
>> per second each yields 9 requests per second aggregate.
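
As a sanity check on the table above, aggregate throughput should be roughly concurrency divided by time per request. The sketch below recomputes it from the quoted OpenAFS numbers; the function name aggregate_rps is hypothetical, and the figures are copied from the table, not re-measured.

```python
# (concurrent users, ms per request) as quoted in the OpenAFS table above.
MEASUREMENTS = [(1, 60), (2, 73), (5, 157), (10, 934), (20, 3666), (30, 10899)]


def aggregate_rps(concurrency: int, ms_per_request: float) -> float:
    """Aggregate requests/second implied by concurrency and per-request latency."""
    return concurrency / (ms_per_request / 1000.0)


for users, ms in MEASUREMENTS:
    print(f"{users:2d} users: {aggregate_rps(users, ms):5.1f} requests per second")
```

The recomputed values line up with the reported requests per second to within rounding, so the collapse past 5 users is consistent with per-request latency growing much faster than concurrency.
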
>> So as you can see performance drops off to an unusable point.  What I
>> also noted was that while watching vmstat the number of context
>> switches soars through the roof.  At the 10 users mark the switches
>> hover around ~200,000 per second.  At 20 users we go to ~250,000 and
>> at 30 users we hit ~270,000 per second.  The machine I'm testing on is
>> a Dell dual xeon 2.8GHz hyperthreaded machine with 2GB of RAM.  The
>> machine isn't swapping during any of this and general memory usage
>> remains fairly constant.  Afsd is configured with:
>> AFSD_ARGS="-fakestat-all -chunksize 20 -volumes 400 -dcache 50000
>> -files 200000 -stat 150000 -daemons 20"
>> I've tried varying the various options to this to try to improve
>> performance, but things only seem to get worse.  Cache size on this
>> machine is currently 100MB and the total content being served is <
>> 1MB.
>> For completeness I've also tested this setup on a similar machine
>> running rhel 5 with arla-current and have come up with some
>> interesting results:
>> 1 user: 22 requests per second 43ms per request
>> 2 users: 32 requests per second 62ms per request
>> 5 users: 36 requests per second 138ms per request
>> 10 users: 36 requests per second 274ms per request
>> 20 users: 37 requests per second 542ms per request
>> 30 users: 36 requests per second 821ms per request
>> This shows me exactly what I would expect to happen on the server.
>> Aggregate requests per second reach a peak performance level which
>> increases the individual request time.
>> Before anyone gets offended by anything I might have said, I want to
>> make it absolutely clear that I am not trying to say that either
>> openafs or arla is better/worse than the other.  I needed something to
>> compare to for the purposes of this discussion.  I realize that the
>> two are very very differently architected in their interactions with
>> the kernel.  I would like to stick with openafs on my servers since
>> that has official support from my systems group.
>> So then I start to wonder about those differences in architecture.  Is
>> there a point in openafs where it is essentially being single
>> threaded?  Possibly in cache reading/validation?  Varying the number
>> of daemons I'm running doesn't appear to affect the performance
>> greatly until I drop down to something like 1 daemon.
>> A little bit more about the PHP code I'm running and about PHP in
>> general.  When you do a require_once in PHP to include a file it
>> essentially stats every directory component between / and the full
>> path to the file it is trying to include.  It does this on every
>> include and does not cache the results.  It also does this in an
>> attempt to locate the file if the include was done as a relative path,
>> ie.:
>> CWD="/afs/cell/folder/virtualhost/directory/page.php"
>> include_path = ".:/usr/local/lib/php/:/afs/cell/folder/where/my/includes/are/"
>> require_once("relative/path/include.inc.php");
>> This would generate stat system calls in approximately this pattern:
>> Check first item in include_path for include ("."):
>> /
>> /afs
>> /afs/cell
>> /afs/cell/folder
>> /afs/cell/folder/virtualhost
>> /afs/cell/folder/virtualhost/directory
>> /afs/cell/folder/virtualhost/directory/relative <-- Fails since node
>> doesn't exist
>> Check second item in include_path for include ("/usr/local/lib/php/"):
>> /
>> /usr
>> /usr/local
>> /usr/local/lib
>> /usr/local/lib/php
>> /usr/local/lib/php/relative <-- Fails again for same reason
>> Check third item in include_path ("/afs/cell/folder/where/my/includes/are"):
>> /
>> /afs
>> /afs/cell
>> /afs/cell/folder
>> /afs/cell/folder/where
>> /afs/cell/folder/where/my
>> /afs/cell/folder/where/my/includes
>> /afs/cell/folder/where/my/includes/are
>> /afs/cell/folder/where/my/includes/are/relative
>> /afs/cell/folder/where/my/includes/are/relative/path
>> /afs/cell/folder/where/my/includes/are/relative/path/include.inc.php
>> <-- Actual fopen
>> Everything except the final fopen is an lstat64 call, as indicated by strace.
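
The lookup pattern described above can be modelled in a few lines to count probes per include. This is only a sketch of the behaviour as described in the quoted message, not PHP's actual implementation: the function names and the "existing" set are invented for illustration, and the working directory is taken to be the directory containing page.php.

```python
import posixpath


def path_prefixes(path):
    """Return '/', '/afs', '/afs/cell', ... for an absolute path."""
    prefixes = ["/"]
    current = ""
    for part in path.strip("/").split("/"):
        current += "/" + part
        prefixes.append(current)
    return prefixes


def probes_for_include(cwd, include_path, relative, existing):
    """List every path touched while resolving one relative include."""
    probes = []
    for entry in include_path:
        base = cwd if entry == "." else entry
        candidate = posixpath.normpath(posixpath.join(base, relative))
        for prefix in path_prefixes(candidate):
            probes.append(prefix)
            if prefix not in existing:
                break  # missing component: give up on this include_path entry
        else:
            return probes  # every component exists, so the file was found
    return probes


cwd = "/afs/cell/folder/virtualhost/directory"
include_path = [".", "/usr/local/lib/php/", "/afs/cell/folder/where/my/includes/are/"]
real_file = "/afs/cell/folder/where/my/includes/are/relative/path/include.inc.php"

# Hypothetical filesystem contents: only the paths named in the example exist.
existing = set(path_prefixes(cwd)) | set(path_prefixes("/usr/local/lib/php"))
existing |= set(path_prefixes(real_file))

probes = probes_for_include(cwd, include_path, "relative/path/include.inc.php", existing)
print(len(probes), "paths touched for one include")
```

Under these assumptions a single include touches 24 paths (23 lstat64 calls plus the final fopen), so a page with about 20 such includes would be in the neighbourhood of several hundred stat calls per hit.
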
>> As you can see this is pretty expensive to do per page hit, especially
>> when this is done repeatedly for every included file.  The page in
>> question includes about 20 or so files.  This is one problem.  I've
>> also put together the same page by pulling in all the code that would
>> have been included to generate a self-contained php file.  The
>> performance tops out around 50 requests per second and still has the
>> same dropoff pattern as you increase the number of simultaneous users
>> hitting the page when using openafs.  So the problem isn't purely in
>> the excessive amount of stat system calls.  If anything it shows that
>> there is a problem somewhere else as well.  Possibly that it requires
>> that the openafs daemon threads be switched in to answer lots of
>> questions too often or that the threads have to talk to each other for
>> every query to the afs file system.
>> This topic of discussion might be more suited to the dev list, but I
>> thought I would start here to see if anyone else was serving php files
>> out of AFS and could reproduce my problems in their environment.
>> If any more information is needed I would be glad to provide it.
>> Thanks in advance,
>> --
>> -Nathan Gordon
>> If the database server goes down and there is no code to hear it, does
>> it really go down?
>> <esc>:wq<CR>