[OpenAFS] Suspect AFS bottlenecks on a web server
Jason Edgecombe
jason@rampaginggeek.com
Wed, 18 Nov 2009 20:00:30 -0500
Thanks, will do.
Derrick Brashear wrote:
> deploy 1.4.10 and that's worth poking
>
> Derrick
>
>
> On Nov 18, 2009, at 6:51 PM, Jason Edgecombe <jason@rampaginggeek.com>
> wrote:
>
>> Nate Gordon wrote:
>>> On Tue, Nov 17, 2009 at 6:25 PM, Jason Edgecombe
>>> <jason@rampaginggeek.com> wrote:
>>>
>>>
>>>> Derrick Brashear wrote:
>>>>
>>>>
>>>>> On Tue, Nov 17, 2009 at 5:09 PM, Jason Edgecombe
>>>>> <jason@rampaginggeek.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Hi Everyone,
>>>>>>
>>>>>> Our webserver has been brought to a crawl many times over the last
>>>>>> few weeks. I suspect it's an AFS bottleneck somewhere. I appreciate
>>>>>> any help I can get.
>>>>>>
>>>>>> The web server runs Solaris 9 w/OpenAFS 1.4.1.
>>>>>>
>>>>>>
>>>>>>
>>>>> is that correct?
>>>>>
>>>>> that's not even worth debugging. lots of things have been fixed since
>>>>> then, this could be something new or one of a dozen things already
>>>>> fixed.
>>>>>
>>>>>
>>>> Yes, 1.4.1 is correct.
>>>> I'm wondering if increasing the number of daemons would help. The afsd
>>>> man page mentions that more than five or six daemons isn't helpful. I
>>>> suspect that the number of Apache daemons (75) is overwhelming the
>>>> number of afsd threads/daemons (5).
>>>>
>>>
>>> As someone who also runs AFS as the backend to a webserver, I can
>>> understand your problems. My problems stem more specifically from PHP
>>> on AFS: PHP as a language performs lots and lots of trivial stat
>>> operations. I have theorized that there are some global locking issues
>>> in the internals of the kernel module that cause problems on
>>> multithreaded systems under high load. Unfortunately, I'm a web geek
>>> and less of a kernel programmer, so I have had limited success in
>>> tracking down and fixing the problem. I also don't think more daemons
>>> will be terribly useful. My understanding is that they aren't used in
>>> local cache operations, and are only used for remote operations when
>>> things are getting behind. I'm currently running 6 daemons for 500
>>> Apache threads.
>>>
>>> I would also echo Derrick's comment on the age of the version you are
>>> using. I have noticed some significant improvements as the 1.4 branch
>>> has gone on.
>>>
>>>
>> Thanks for the info about the daemons. We have lots of sites running
>> Joomla and PHP. I noticed a 5% vcache miss rate compared to a 1%
>> dcache miss rate on our web server. That corroborates your statement
>> about stat calls.
>>
>> Derrick, I have 1.4.10 with the
>> STABLE14-background-fsync-consistency-issues patch already compiled
>> and ready to deploy. Would that be new enough to consider debugging?
>>
>> I'm planning on upgrading our web server to 1.4.10 in December.
>>
>> Jason
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>
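
For anyone comparing their own numbers with the vcache and dcache miss rates quoted in this thread, here is a minimal sketch of the arithmetic. The thread does not say how the figures were collected, so the counter values below are hypothetical and were chosen only to reproduce the 5% and 1% rates; `miss_rate` is an illustrative helper, not part of OpenAFS.

```python
def miss_rate(hits: int, misses: int) -> float:
    """Return misses as a fraction of all lookups (0.0 when there were none)."""
    total = hits + misses
    return misses / total if total else 0.0


# Hypothetical counters (assumption): picked to match the 5% and 1% figures.
vcache = miss_rate(hits=9_500, misses=500)
dcache = miss_rate(hits=9_900, misses=100)

print(f"vcache miss rate: {vcache:.1%}")
print(f"dcache miss rate: {dcache:.1%}")
```

A vcache (file status) miss rate well above the dcache (file data) miss rate is the pattern Jason reads as consistent with a stat-heavy workload, since stat calls need file metadata rather than file contents.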