[OpenAFS-devel] Kernel Panic with openafs

Marcus Watts mdw@spam.ifs.umich.edu
Fri, 31 Aug 2007 15:07:52 -0400


Randy Philipp <randy@umbc.edu> writes:
> Subject: [OpenAFS-devel] Kernel Panic with openafs
> openafs: afs global lock not held at
> /afs/umbc.edu/users/k/h/kherna1/home/Build/rpmbuild/BUILD/openafs-1.4.3/src/libafs/MODLOAD-2.6.18-8.1.8.el5-SP/afs_lock.c:133
...

Do you have a way to repeat this on demand?
Does it happen often?

This probably happened because in afs_pag_destroy at
src/afs/LINUX/osi_groups.c:596 the macro ISAFS_GLOCK
returned true, but when afs_FindUser invoked ObtainWriteLock
and then Afs_Lock_Obtain, the macro AFS_ASSERT_GLOCK failed
(which means ISAFS_GLOCK returned false there).

So, um.  Lock ownership is supposed to be determined by
storing the pid in afs_global_owner.
Is it possible in RHEL 5 for more than one kernel thread to have the same pid?
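
To make the question concrete, here is a rough userspace model of a
pid-based ownership test. It is only a sketch: the model_* names are
made up for illustration and are not the OpenAFS macros, and treating
owner pid 0 as "not held" is an assumption of the model. The point is
that any task presenting the same pid as the recorded owner passes the
"do I hold the lock" test, whether or not it is the task that took it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in OpenAFS. */
struct model_task {
    int pid;            /* the only thing the ownership test looks at */
};

struct model_glock {
    int owner_pid;      /* assumption: 0 means "not held" */
};

/* Take the lock and record the caller's pid as owner. */
static void model_glock_acquire(struct model_glock *l,
                                const struct model_task *t)
{
    assert(l->owner_pid == 0);      /* must not already be held */
    l->owner_pid = t->pid;
}

/* Drop the lock; only the recorded owner pid may do so. */
static void model_glock_release(struct model_glock *l,
                                const struct model_task *t)
{
    assert(l->owner_pid == t->pid);
    l->owner_pid = 0;
}

/* Ownership test: compares pids and nothing else. */
static bool model_is_glock_held(const struct model_glock *l,
                                const struct model_task *t)
{
    return l->owner_pid == t->pid;
}
```

In this model, if task A with pid 100 acquires the lock, a second task
that also reports pid 100 gets "true" from model_is_glock_held, while a
task with pid 200 gets "false".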

The actual lock is afs_global_lock, which for 2.6.18 should
be a "struct mutex".  If you build a kernel with CONFIG_DEBUG_MUTEXES
set, you get more debugging information on these, which might be
interesting.

In the kernel, a "task" is defined by a unique "struct thread_info",
so once you have CONFIG_DEBUG_MUTEXES set, it would be interesting to see
if it's possible to have
	current->pid == afs_global_owner
when
	current->thread_info != afs_global_lock.owner
It might be worth changing current->pid to current->thread_info
in all the GLOCK stuff.
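
The comparison above could be sketched roughly like this. This is a
userspace model, not kernel code: the model_* structures are
hypothetical stand-ins for the task, its thread_info, and the lock, and
the mutex_owner field only models the owner that a mutex debugging
build would record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct model_thread_info {
    int placeholder;
};

struct model_task {
    int pid;
    struct model_thread_info *thread_info;  /* unique per task */
};

struct model_lock {
    int pid_owner;                          /* models afs_global_owner */
    struct model_thread_info *mutex_owner;  /* models the debug owner */
};

/* Record both notions of ownership when the lock is taken. */
static void model_lock_acquire(struct model_lock *l,
                               const struct model_task *t)
{
    assert(l->mutex_owner == NULL);         /* must not already be held */
    l->pid_owner = t->pid;
    l->mutex_owner = t->thread_info;
}

/*
 * True when the pid test claims the caller owns the lock but the
 * per-task owner recorded in the mutex says otherwise.
 */
static bool model_owner_mismatch(const struct model_lock *l,
                                 const struct model_task *cur)
{
    return cur->pid == l->pid_owner &&
           cur->thread_info != l->mutex_owner;
}
```

A check of this shape only reports a mismatch for a task that shares
the owner's pid without being the owner; the real owner and tasks with
other pids both come back false.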

					-Marcus Watts