[OpenAFS-devel] Solaris fixes for 1.4.x / AFS_SUN510_ENV

Derrick Brashear shadow@gmail.com
Mon, 11 Feb 2008 13:28:51 -0500


> >> >> 1. SSYS process exiting considered harmful
> >> >>
> >> >>   The first problem is that setting process flag SSYS on a process that
> >> >>   exits, as the afs_osi_Invisible routine on Solaris 10 does, causes the
> >> >>   system not to clean up the contract state of the process.  This leaves
> >> >>   a dangling kernel-memory pointer in the contract table which used to
> >> >>   point to the process struct.
> >> >>
> >> >>   Any user can corrupt kernel memory and cause a panic with the 'ctstat'
> >> >>   command, and the system cannot shut down without either panicking or
> >> >>   going into an infinite loop as svc.startd repeatedly tries to kill the
> >> >>   non-existent process.
> >> >>
> >> >> I really don't know why the code would set SSYS on a userland process
> >> >> that's about to exit in the first place.  Can anyone shed any light?
> >> >
> >> > Threads that call afs_osi_Invisible are not about to exit; they're about to
> >> > become long-lived AFS kernel threads.  Setting SSYS is correct; we just
> >>
> >> Actually it is not appropriate for an arbitrary thread/proc to set SSYS.
> >>
> >> Only system processes [they exist only in the kernel, i.e. p_as is set to kas]
> >> created with newproc() are eligible for SSYS, and that happens automatically in newproc().
> >
> > This is a system process, just not one created by newproc().
>
> Actually, there are only a few 'system processes': sched, init, pageout, fsflush,
> zsched and the cluster_wrapper. There are no other 'system processes' in that sense.
>
> refer to main() in
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/main.c
>
> regular kernel threads are parented to sched (p0) while zone specific kernel threads
> created by zthread_create() are parented to zsched.
>
> > Presumably we need to do something analogous to the linux
> > kernel_thread code, calling newproc.
>
> nope, we've been there before:
>
> http://www.openafs.org/pipermail/openafs-devel/2002-April/007896.html
>
> I wonder what you are trying to accomplish by setting SSYS? And I'm still
> unclear whether you are doing this to a kernel thread or a userland process.

afsd calls the afs syscall, which sets SSYS; before the syscall
returns, SSYS is cleared. I don't have notes handy, but I assume the
reasoning was "we really aren't interested in being signalled while
we're in the kernel". If that's really it, then I guess the options
are lwp_sigmask, or a switch to real (not newproc) kernel threads.