[OpenAFS] OpenAFS 1.3.87 and 1.4.0-rc6 stability issues on Solaris 10

Derrick J Brashear shadow@dementia.org
Thu, 13 Oct 2005 16:57:05 -0400 (EDT)


On Thu, 13 Oct 2005, Loic Tortay wrote:

> According to chas williams - CONTRACTOR:
>> In message <20051012003336.GA4896@ccali22.in2p3.fr>,Loic Tortay writes:
>>> "svcs -p" seems to be the tip of the iceberg, the machine also panics
>>> with "ctstat -v" (whether AFS was started automatically or not).
>>
>> "dont do that"
>>
> Easier said than done. :-)
>
> Both "svcs" and "ctstat" are non-setuid binaries, and while it's easy
> to prevent users from executing these two programs, there are cases when at
> least "svcs -p" is useful especially for "root" or developers.
>
> I guess there are probably other commands triggering the panic.
>
> It's only a matter of time before some user tries one of these commands.
>
>>
>> it seems like this might be a bug in solaris10 when handling contracts
>> of exiting children who have created kernel threads.  the rxlistener is
>> a kernel thread on solaris and the child that starts the kernel_thread
>> returns and exits.
>>
>> try this patch.
>>
>> it cleans up the child process and seems to help things (the listener
>> thread seems to join/attach to pid 0).
>>
> I applied your patch to 1.4.0-rc7 and it solves the problem: I have run
> both "svcs -p" and "ctstat -v" one after the other about 11000 times.
>
> Thank you so much.
>
> I submitted a bug report for this problem this afternoon; it's
> ticket #22317.

Of course we don't have the answer to the stack size problem yet, right?