[OpenAFS-devel] pthreads, darwin, microsoft: untwisting the knot

Marcus Watts mdw@umich.edu
Thu, 26 May 2005 23:04:05 -0400


I ran into an interesting library issue, which I think may be of
interest to others.  The problem is linking rx WITHOUT making any
use of rxkad or des.  One way or another, this is likely to become
common, and there's already code in openafs that does just this.  I'd
like to make this less "magical".

The obvious symptom is that if you link such code on darwin you will
usually get one or more of these undefined:
	des_init_mutex
	des_random_mutex
	rxkad_random_mutex
	rxkad_client_uid_mutex
	rxkad_stats_mutex

The problem is that in the rx library, there is a routine
	rxi_InitPthread
which initializes about 20 mutexes.  Many of these are part of rx.  But
these 5 mutexes are not; they are part of libraries under rx.

On most reasonable architectures, that's no big deal.  The mutex is
defined as uninitialized data, which results in a ".comm" global
symbol, which causes the linker to load the object that contains it,
and all is well--assuming you don't mind code bloat that is.

Apparently, the Apple folks took exception to this, so they modified
"ranlib" to NOT add such symbols to the library table of contents.  I
guess their argument is if you aren't referencing any of the code, why
do you need the file in the first place?  Since they brag about this in
their man page, it seems unlikely they'll "fix" this bug, but they did
include "ranlib -c" which fixes .comm handling.  This was almost
useful to me, and might be worth doing anyways for openafs.

Anyways, if you look at openafs code such as
	src/libadmin/samples/cm_client_config.c
	src/libadmin/test/afscp.c
you'll find it has something that reads:
	#ifdef AFS_DARWIN_ENV
and then the above list of mutexes.  Actually
it reads more like this:
	pthread_mutex_t des_init_mutex = PTHREAD_MUTEX_INITIALIZER;

This is interesting.  On any "reasonable" pthread library, this exact
mechanism can be used to initialize a pthread_mutex_t variable quite
nicely right where it's allocated, and there's no need for rxi_InitPthread
to even exist.
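
To illustrate the point, here is a minimal sketch of a mutex that is
initialized right where it is allocated.  The names example_mutex,
example_counter and example_bump are hypothetical and not taken from
openafs; only PTHREAD_MUTEX_INITIALIZER and the pthread_mutex_* calls
are standard.

```c
#include <pthread.h>

/* Hypothetical stand-in for a library-private lock such as des_init_mutex.
 * The static initializer makes it usable without any init routine. */
static pthread_mutex_t example_mutex = PTHREAD_MUTEX_INITIALIZER;
static int example_counter = 0;

/* Increment a shared counter under the statically initialized mutex
 * and return the new value. */
int example_bump(void)
{
    int value;

    pthread_mutex_lock(&example_mutex);
    value = ++example_counter;
    pthread_mutex_unlock(&example_mutex);
    return value;
}
```

Because the initializer lives in the same object file as the code that
uses the lock, nothing outside that file needs to reference the mutex.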

So far as I know, this covers every architecture currently supported by
OpenAFS.  Except one.  Windows.  The MS windows openafs code has its
own pthreads emulation library on top of native win32 thread support,
which does not include PTHREAD_MUTEX_INITIALIZER.  Bummer.
So, in the hopes that this is the *only* reason for this unfortunate
construction, I have a possible fix.

The problem with Windows NT is apparently that there's no static
initializer with the same effect as
	InitializeCriticalSection()
so only dynamic initialization of the guts of a pthread_mutex_t will
work.  That's certainly a problem.

The fix: modify pthread_mutex_t to have a new field, "IsInitialized".
[ It might be wise to include padding to some round boundary as well to
make library versioning a bit simpler. ]  Declare PTHREAD_MUTEX_INITIALIZER
to put a 0 into this field.  Modify
	pthread_mutex_init
	pthread_mutex_trylock
	pthread_mutex_lock
	pthread_mutex_unlock
	pthread_mutex_destroy
to check this.  Generally speaking, if it's not set, acquire a private
mutex, and if it's still not set, call InitializeCriticalSection for
the mutex.  Use a single CRITICAL_SECTION variable private to pthread.c
for the private mutex, and use pthread_once to initialize it.
Obviously, pthread_mutex_init and pthread_mutex_destroy can collapse
some of this logic.  Making the "set" value be some magic number would
make it possible to detect "garbage" and produce useful diagnostics.
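
The scheme above can be modeled roughly as follows.  This is only a
portable sketch, not win32 code and not the existing openafs emulation:
a plain pthread_mutex_t stands in for CRITICAL_SECTION, and
pthread_mutex_init() stands in for InitializeCriticalSection().  All
emu_* names, EMU_MAGIC and EMU_MUTEX_INITIALIZER are hypothetical.  The
flag is read with C11 atomics, which is an assumption on my part about
how the unlocked first check would be made safe.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define EMU_MAGIC 0x4d545831    /* arbitrary "initialized" marker (assumption) */

typedef struct {
    atomic_int state;           /* 0 = static initializer, EMU_MAGIC = ready */
    pthread_mutex_t native;     /* stand-in for a win32 CRITICAL_SECTION */
} emu_mutex_t;

#define EMU_MUTEX_INITIALIZER { 0 }

/* Single private lock guarding lazy initialization, set up via pthread_once. */
static pthread_once_t guard_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t guard;

static void guard_setup(void)
{
    pthread_mutex_init(&guard, NULL);
}

/* Check the flag; if unset, take the private lock, re-check, then initialize.
 * Any value other than 0 or EMU_MAGIC is treated as garbage. */
static int emu_ensure_init(emu_mutex_t *m)
{
    int s = atomic_load_explicit(&m->state, memory_order_acquire);

    if (s == EMU_MAGIC)
        return 0;
    if (s != 0)
        return EINVAL;

    pthread_once(&guard_once, guard_setup);
    pthread_mutex_lock(&guard);
    s = atomic_load_explicit(&m->state, memory_order_relaxed);
    if (s == 0) {
        pthread_mutex_init(&m->native, NULL);
        atomic_store_explicit(&m->state, EMU_MAGIC, memory_order_release);
        s = EMU_MAGIC;
    }
    pthread_mutex_unlock(&guard);
    return s == EMU_MAGIC ? 0 : EINVAL;
}

int emu_mutex_lock(emu_mutex_t *m)
{
    int rc = emu_ensure_init(m);

    return rc ? rc : pthread_mutex_lock(&m->native);
}

/* Unlocking a mutex that was never initialized (hence never locked)
 * is reported as an error rather than initializing it. */
int emu_mutex_unlock(emu_mutex_t *m)
{
    if (atomic_load_explicit(&m->state, memory_order_acquire) != EMU_MAGIC)
        return EINVAL;
    return pthread_mutex_unlock(&m->native);
}
```

The fast path costs one flag read per call; the private lock is taken
only by threads that race on the first use of a given mutex.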

pthread_once() has another race condition as well which is described
in its associated comment.  I think this might benefit from being
protected by another CRITICAL_SECTION, which in turn could be
initialized by using a tiny C++ stub with a constructor designed
to run at loadtime.  I don't know if this is worth it.

So is MS windows truly the only OS with this problem?
Anybody see a problem with making use of PTHREAD_MUTEX_INITIALIZER
to make openafs more modular?

Caveat: I'm not a MS windows programmer.  I can very likely find
one who will volunteer if that's useful.

				-Marcus Watts
				UM ITCS Umich Systems Group