[OpenAFS] Re: Openafs performance with Apache/PHP

Nate Gordon nlgordon@gmail.com
Mon, 13 Aug 2007 18:11:51 -0500


Current args to afsd:
AFSD_ARGS="-fakestat-all -chunksize 20 -volumes 400 -dcache 50000
-files 200000 -stat 150000 -daemons 20"

I believe the default chunksize is 16 or 18.  I might try going back
to that tomorrow when I get back into the office.  The majority of my
files are pretty small.
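
For reference, -chunksize is the base-2 log of the chunk size in
bytes, so the settings in play here work out like this (a quick shell
sanity check):

```shell
# afsd's -chunksize n means chunks of 2^n bytes
echo $((1 << 20))   # -chunksize 20 -> 1048576 bytes (1 MiB), current setting
echo $((1 << 18))   # -chunksize 18 -> 262144 bytes (256 KiB)
echo $((1 << 16))   # -chunksize 16 -> 65536 bytes (64 KiB)
```

Whether the smaller chunks actually help with mostly-small files is
what I'll find out tomorrow.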

On 8/13/07, Jason Edgecombe <jason@rampaginggeek.com> wrote:
> Hi,
>
> Have you tried tinkering with the chunksize value? It's specified when
> afsd is run and tells the client how big of a chunk of each file to request.
>
> Jason
>
> Nate Gordon wrote:
> > So after my server had yet another round of fits, I'm bringing this
> > back up.  My cache is currently hanging around 26GB used, which makes
> > some sense since I've probably got ~35GB of actual content.  Is this
> > too large a cache?  I know previous versions of openafs had issues
> > with this.
> >
> > I would be willing to try devel versions of things for testing.  I
> > would actually like to, except that I'm not a kernel genius and the
> > last time I tried 1.5 (1.5.21) I couldn't get it to compile against a
> > rhel5 kernel.
> >
> > Is there any way to tune the afs client to deal better with lots and
> > lots of small file operations?
> >
> > With the start of classes coming up soon, this will most likely cause
> > my server to do very bad things on a very regular basis.  I'm willing
> > to contemplate any idea at least once.
> >
> > On 7/27/07, Nate Gordon <nlgordon@gmail.com> wrote:
> >
> >> So I'm finally getting around to asking about some issues I've been
> >> having for a long time with openafs 1.4.4 on rhel (3|4|5).  In short,
> >> the performance of my webservers serving up PHP code is less than
> >> stellar in some cases, those cases being whenever the aggregate
> >> request rate climbs above roughly 30 requests per second.  Here is an
> >> example/summary of the issue I'm having:
> >>
> >> I have a php page, that php page includes some other php files and
> >> makes two database queries.  If I point apache bench at that page on
> >> one of my test servers I get results as follows:
> >>
> >> 1 concurrent user: 16 requests per second 60ms per request
> >> 2 concurrent users: 27 requests per second 73ms per request
> >> 5 concurrent users: 32 requests per second 157ms per request
> >> 10 concurrent users: 10 requests per second 934ms per request
> >> 20 concurrent users: 5 requests per second 3666ms per request
> >> 30 concurrent users: 3 requests per second 10899ms per request
> >>
> >> * requests per second being defined as the aggregate requests per
> >> second across all the concurrent users.  Thus 3 users at 3 requests
> >> per second each yields 9 requests per second aggregate.
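
For anyone who wants to reproduce the numbers above: they came from
apache bench, with each run along these lines (the URL and request
count here are placeholders, not the exact values used):

```shell
# -n: total requests to issue, -c: concurrency level
# (the runs above used -c 1, 2, 5, 10, 20, and 30)
ab -n 1000 -c 10 http://testserver.example.com/page.php
```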
> >>
> >> So as you can see, performance drops off to an unusable point.  What
> >> I also noted, watching vmstat, was that the number of context
> >> switches soars through the roof.  At the 10 user mark they hover
> >> around ~200,000 per second.  At 20 users we go to ~250,000, and
> >> at 30 users we hit ~270,000 per second.  The machine I'm testing on is
> >> a Dell dual xeon 2.8Ghz hyperthreaded machine with 2GB of ram.  The
> >> machine isn't swapping during any of this and general memory usage
> >> remains fairly constant.  Afsd is configured with:
> >> AFSD_ARGS="-fakestat-all -chunksize 20 -volumes 400 -dcache 50000
> >> -files 200000 -stat 150000 -daemons 20"
> >>
> >> I've tried varying these options to try to improve
> >> performance, but things only seem to get worse.  Cache size on this
> >> machine is currently 100MB and the total content being served is <
> >> 1MB.
> >>
> >> For completeness I've also tested this setup on a similar machine
> >> running rhel 5 with arla-current and have come up with some
> >> interesting results:
> >>
> >> 1 user: 22 requests per second 43ms per request
> >> 2 users: 32 requests per second 62ms per request
> >> 5 users: 36 requests per second 138ms per request
> >> 10 users: 36 requests per second 274ms per request
> >> 20 users: 37 requests per second 542ms per request
> >> 30 users: 36 requests per second 821ms per request
> >>
> >> This shows me exactly what I would expect to happen on the server:
> >> aggregate requests per second reach a peak and level off, and beyond
> >> that point only the individual request time increases.
> >>
> >> Before anyone gets offended by anything I might have said, I want to
> >> make it absolutely clear that I am not trying to say that either
> >> openafs or arla is better/worse than the other.  I needed something to
> >> compare to for the purposes of this discussion.  I realize that the
> >> two are architected very differently in their interactions with
> >> the kernel.  I would like to stick with openafs on my servers since
> >> that has official support from my systems group.
> >>
> >> So then I start to wonder about those differences in architecture.  Is
> >> there a point in openafs where it is essentially single-threaded?
> >> Possibly in cache reading/validation?  Varying the number
> >> of daemons I'm running doesn't appear to affect the performance
> >> greatly until I drop down to something like 1 daemon.
> >>
> >> A little bit more about the PHP code I'm running and about PHP in
> >> general.  When you do a require_once in PHP to include a file it
> >> essentially stats every directory component between / and the full
> >> path to the file it is trying to include.  It does this on every
> >> include and does not cache the results.  It also does this in an
> >> attempt to locate the file if the include was done as a relative path,
> >> e.g.:
> >> CWD="/afs/cell/folder/virtualhost/directory/page.php"
> >> include_path = ".:/usr/local/lib/php/:/afs/cell/folder/where/my/includes/are/"
> >> require_once("relative/path/include.inc.php");
> >>
> >> This would generate stat system calls in approximately this pattern:
> >> Check first item in include_path for include ("."):
> >> /
> >> /afs
> >> /afs/cell
> >> /afs/cell/folder
> >> /afs/cell/folder/virtualhost
> >> /afs/cell/folder/virtualhost/directory
> >> /afs/cell/folder/virtualhost/directory/relative <-- Fails since node
> >> doesn't exist
> >> Check second item in include_path for include ("/usr/local/lib/php/"):
> >> /
> >> /usr
> >> /usr/local
> >> /usr/local/lib
> >> /usr/local/lib/php
> >> /usr/local/lib/php/relative <-- Fails again for same reason
> >> Check third item in include_path ("/afs/cell/folder/where/my/includes/are"):
> >> /
> >> /afs
> >> /afs/cell
> >> /afs/cell/folder
> >> /afs/cell/folder/where
> >> /afs/cell/folder/where/my
> >> /afs/cell/folder/where/my/includes
> >> /afs/cell/folder/where/my/includes/are
> >> /afs/cell/folder/where/my/includes/are/relative
> >> /afs/cell/folder/where/my/includes/are/relative/path
> >> /afs/cell/folder/where/my/includes/are/relative/path/include.inc.php
> >> <-- Actual fopen
> >>
> >> Everything except the final fopen shows up as an lstat64 call in strace.
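
If anyone wants to check this pattern on their own setup, strace can
count the calls directly; a sketch, using the page path from the
example above:

```shell
# Tally syscalls for one CLI render of the page.  On 64-bit kernels
# the call shows up as lstat rather than lstat64.
strace -f -c -e trace=lstat64,open \
    php /afs/cell/folder/virtualhost/directory/page.php
```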
> >>
> >> As you can see this is pretty expensive to do per page hit, especially
> >> when this is done repeatedly for every included file.  The page in
> >> question includes about 20 or so files.  This is one problem.  I've
> >> also put together the same page by pulling in all the code that would
> >> have been included to generate a self contained php file.  The
> >> performance tops out around 50 requests per second and still has the
> >> same dropoff pattern as you increase the number of simultaneous users
> >> hitting the page when using openafs.  So the problem isn't purely in
> >> the excessive number of stat system calls.  If anything, it shows
> >> that there is a problem somewhere else as well.  Possibly the openafs
> >> daemon threads have to be switched in to answer lots of questions too
> >> often, or the threads have to talk to each other for every query to
> >> the afs file system.
> >>
> >> This topic of discussion might be more suited to the dev list, but I
> >> thought I would start here to see if anyone else was serving php files
> >> out of AFS and could reproduce my problems in their environment.
> >>
> >> If any more information is needed I would be glad to provide it.
> >>
> >> Thanks in advance,
> >>
> >> --
> >> -Nathan Gordon
> >>
> >> If the database server goes down and there is no code to hear it, does
> >> it really go down?
> >> <esc>:wq<CR>
> >>
> >>
> >
> >
> >
>
>


-- 
-Nathan Gordon

If the database server goes down and there is no code to hear it, does
it really go down?
<esc>:wq<CR>