[OpenAFS] 1.6.0-pre2 ptserver/vlserver dumping core

Ryan C. Underwood nemesis@icequake.net
Mon, 28 Feb 2011 19:18:07 -0600

On Mon, Feb 28, 2011 at 03:32:06AM -0500, Derrick Brashear wrote:
> same backtrace, or new one? the last looked like the double-free issue
> fixed in pre2. also, if nothing else is dying, that suggests LWP (and
> not pthreads) is what's relevant

Very similar; here's an example of the new one.  It looks like a
double-free at the end of rxi_CleanupConnection(), if the glibc
output in the log is to be trusted.  I can try to build a full
debug version if you think it would help.  Is it possible to run the
servers under valgrind?

#0  0xb7821424 in __kernel_vsyscall ()
#1  0xb76e6751 in raise () from /lib/i686/cmov/libc.so.6
#2  0xb76e9b82 in abort () from /lib/i686/cmov/libc.so.6
#3  0xb771d18d in ?? () from /lib/i686/cmov/libc.so.6
#4  0xb7727281 in ?? () from /lib/i686/cmov/libc.so.6
#5  0xb7728ad8 in ?? () from /lib/i686/cmov/libc.so.6
#6  0xb772bbbd in free () from /lib/i686/cmov/libc.so.6
#7  0x0807a3bd in rxi_CleanupConnection (conn=0xb77fdff4) at rx.c:991
#8  0x0807dcd4 in rxi_CheckCall (call=0x9586c10) at rx.c:6150
#9  0x0807e17d in rxi_GrowMTUEvent (event=0x0, arg1=0x9586c10, dummy=0x0) at rx.c:6382
#10 0x0808792d in rxevent_RaiseEvents (next=0xb75ecf6c) at rx_event.c:503
#11 0x08077b18 in rxi_ListenerProc (rfds=<value optimized out>, tnop=<value optimized out>, newcallp=<value optimized out>) at rx_lwp.c:203
#12 0x08077e6a in rx_ListenerProc (dummy=0x0) at rx_lwp.c:335
#13 0x08088981 in Create_Process_Part2 () at ./lwp.c:805
#14 0xb76f5cdb in makecontext () from /lib/i686/cmov/libc.so.6
#15 0x0d696910 in ?? ()
#16 0x08089108 in LWP_MwaitProcess (event=0x9611f60) at ./lwp.c:756
#17 LWP_WaitProcess (event=0x9611f60) at ./lwp.c:708
#18 0x08080b88 in rx_GetCall (tno=10, cur_service=0x95873a0, socketp=0xbffa8a2c) at rx.c:2062
#19 0x08080ddd in rxi_ServerProc (threadID=10, newcall=0x0, socketp=0xbffa8a2c) at rx.c:1654
#20 0x08077e0a in rx_ServerProc (unused=0x0) at rx_lwp.c:369
#21 0x08081488 in rx_StartServer (donateMe=1) at rx.c:793
#22 0x0804a8ba in main (argc=1, argv=0xbffa9064) at ptserver.c:565

I tried running vlserver under valgrind and immediately got pages of
errors, including invalid writes, preceded by the following:

==17866== Memcheck, a memory error detector
==17866== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==17866== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==17866== Command: /usr/lib/openafs/vlserver
==17866== Parent PID: 17864
==17866== Warning: client switching stacks?  SP change: 0xbef950ec --> 0x41d8438
==17866==          to suppress, use: --max-stackframe=1160000332 or greater
==17866== Warning: client switching stacks?  SP change: 0x41d827c --> 0xbef950ec
==17866==          to suppress, use: --max-stackframe=1159999888 or greater
==17866== Warning: client switching stacks?  SP change: 0xbef9512c --> 0x428f138
==17866==          to suppress, use: --max-stackframe=1160749068 or greater
==17866==          further instances of this message will not be shown.

Ryan C. Underwood, <nemesis@icequake.net>