[OpenAFS-devel] Solaris fixes for 1.4.x / AFS_SUN510_ENV

Frank Batschulat (Home) Frank.Batschulat@Sun.COM
Mon, 11 Feb 2008 19:23:34 +0100


On Mon, 11 Feb 2008 16:31:23 +0100, Derrick Brashear <shadow@gmail.com> wrote:

> On Feb 11, 2008 9:32 AM, Frank Batschulat (Home)
> <Frank.Batschulat@sun.com> wrote:
>> On Wed, 30 Jan 2008 20:44:34 +0100, Jeffrey Hutzelman <jhutz@cmu.edu> wrote:
>>
>> > --On Wednesday, January 30, 2008 06:14:02 PM +1100 Mike Battersby
>> > <mib@unimelb.edu.au> wrote:
>> >
>> >> 1. SSYS process exiting considered harmful
>> >>
>> >>   The first problem is that setting process flag SSYS on a process that
>> >>   exits, as the afs_osi_Invisible routine on Solaris 10 does, causes the
>> >>   system not to clean up the contract state of the process.  This leaves
>> >>   a dangling kernel-memory pointer in the contract table which used to
>> >>   point to the process struct.
>> >>
>> >>   Any user can corrupt kernel memory and cause a panic with the 'ctstat'
>> >>   command and the system cannot shut down without either panicking or
>> >>   going into an infinite loop as svc.startd repeatedly tries to kill the
>> >>   non-existent process.
>> >>
>> >> I really don't know why the code would set SSYS on a userland process
>> >> that's about to exit in the first place.  Can anyone shed any light?
>> >
>> > Threads that call afs_osi_Invisible are not about to exit; they're about to
>> > become long-lived AFS kernel threads.  Setting SSYS is correct; we just
>>
>> Actually it is not appropriate for an arbitrary thread/proc to set SSYS.
>>
>> Only system processes [they exist only in kernel, i.e. p_as is set to kas]
>> created with newproc() are eligible for SSYS, and that happens automatically in newproc().
>
> This is a system process, just not one created by newproc().

Actually there are only a few 'system processes', and these are sched, init, pageout, fsflush,
zsched and the cluster_wrapper. There are no other 'system processes' in that sense of the term.

Refer to main() in
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/main.c

Regular kernel threads are parented to sched (p0), while zone-specific kernel threads
created by zthread_create() are parented to zsched.
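
To make the distinction concrete, here is a rough userland toy model of the
rules described above. It is only a sketch, not kernel code: every model_*
name, the MODEL_SSYS bit value and the struct layout are made up for
illustration and are not the real Solaris definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userland model. None of this is the real Solaris source. */
struct model_as { int unused; };          /* stand-in for an address space */

#define MODEL_SSYS 0x1u                   /* placeholder bit, value is an assumption */

struct model_proc {
    const char *name;
    struct model_as *p_as;                /* address space the process runs in */
    unsigned p_flag;
    struct model_proc *parent;
};

static struct model_as model_kas;         /* stand-in for the kernel address space (kas) */
static struct model_as model_user_as;     /* stand-in for some user address space */

/* Models the rule that a system process lives in kas and gets SSYS at creation. */
static struct model_proc
model_newproc_sys(const char *name, struct model_proc *parent)
{
    struct model_proc p = { name, &model_kas, MODEL_SSYS, parent };
    return p;
}

/* Models an ordinary userland process: own address space, no SSYS. */
static struct model_proc
model_user_proc(const char *name, struct model_proc *parent)
{
    struct model_proc p = { name, &model_user_as, 0u, parent };
    return p;
}

/* A process counts as a system process only if it runs in kas. */
static bool
model_is_system_proc(const struct model_proc *p)
{
    return p->p_as == &model_kas;
}

/* SSYS is consistent only when the process really is a system process. */
static bool
model_ssys_consistent(const struct model_proc *p)
{
    return (p->p_flag & MODEL_SSYS) == 0 || model_is_system_proc(p);
}

/* Regular kernel threads hang off p0; zone-specific ones hang off zsched. */
static struct model_proc *
model_kthread_parent(bool zone_specific, struct model_proc *p0,
    struct model_proc *zsched)
{
    return zone_specific ? zsched : p0;
}
```

In this model, a userland process that sets MODEL_SSYS on itself fails
model_ssys_consistent(), which is the mismatch I am pointing at.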

> Presumably we need to do something analogous to the linux
> kernel_thread code, calling newproc.

Nope, we've been there before:

http://www.openafs.org/pipermail/openafs-devel/2002-April/007896.html

I wonder what you are trying to accomplish by setting SSYS? And I'm still
unclear whether you are doing this to a kernel thread or a userland process.

---
frankB