[OpenAFS-devel] Stack overflow

Marcus Watts mdw@umich.edu
Mon, 02 Aug 2010 11:40:00 -0400


Richard Low <rlow@acunu.com> writes:
> Date:    Mon, 02 Aug 2010 13:46:40 BST
> To:      openafs-devel@openafs.org
> From:    Richard Low <rlow@acunu.com>
> Subject: [OpenAFS-devel] Stack overflow
> 
> Hi all,
> 
> I'm using the openafs Rx code in a standalone application and have run into
> some multithreading issues.  It seems that the build on my system is not
> thread safe, and I get stack overflow errors whenever I have multiple client
> threads.  I also get the same error when using a server & client within the
> same process.
> 
> An example output:
> 
> stackcheck = 50462976: stack = 50462976
> topstack = 0x2cce43ec: stackptr = 0x7dde2010: stacksize = 0x48000
> Wed Jul 28 08:53:39 2010 LWP: stack overflow in process IO MANAGER!
> Aborted
> 
> My afs-sysname is amd64_linux26, running on CentOS 5.5.  My configure
> command:
> 
> ./configure --with-afs-sysname=amd64_linux26 --prefix=/usr
> --libdir=/usr/lib64 --bindir=/usr/bin --sbindir=/usr/sbin
> --disable-strip-binaries --with-krb5-conf=/usr/kerberos/bin/krb5-config
> --enable-redhat-buildsys --enable-transarc-paths
> 
> This shows on openafs version 1.4.12.1 and 1.5.75.  I've tried setting
> various #defines (such as USE_UCONTEXT as suggested earlier on this list)
> but haven't had any success.
> 
> Am I doing something wrong with my build?  Or has multithreaded support
> broken? I can provide a small C file that demonstrates the crash.
> 
> Many thanks,
> 
> Richard.

AFS supports 2 (well, really 3) thread models: LWP, pthread, and also
the kernel.  Each has its advantages and disadvantages, which vary
somewhat by architecture.

You should not need to set USE_UCONTEXT.  The AFS libraries should
already be compiled with this set right.  If you actually need to vary it,
you'll need to change param.<sysname>.h and rebuild afs.  You probably
have this set already in any case.

If you want to use LWP, the thread stacksize is fixed at thread creation
time.  You can increase that size by setting the 2nd parameter to
LWP_CreateProcess to the value you desire.  You might find it useful
to do this in units of rx_stackSize, i.e., rx_stackSize * 2.  It is
generally useful to limit the number and size of automatic variables,
and to avoid recursion or other techniques that might unnecessarily
bloat your stack needs.

In some versions of lwp, LWP_StackUsed can be called to return the
"high tide" mark of stack use.

Kernel mode threads have similar stack size limitations.

Sorting out library requirements between pthread/lwp, symbol visibility,
and versions of openafs can be a challenge.  It is best to avoid mixing
lwp and pthread libraries.  The sizes of many structures differ between
the two, which can cause non-obvious bugs.  It is also bad to call pure lwp libraries
from an application which is in fact pthreaded.

For lwp:
[ -lvldb -lprot -lubik ] -lauth -lrxkad -lsys -lrx -llwp -ldes -lafsutil
Rough equivalent for pthread:
[ -lafsauthent ] -lafsrpc -lpthread
You probably don't need prot / afsauthent (are you making pt or vl calls?).
Symbol visibility in -lafsrpc is pretty ad-hoc.  Sometimes it's simpler to use
/usr/lib/libafsrpc.a
but you shouldn't use that if your code needs to be "pic".

					-Marcus Watts