[OpenAFS-devel] FreeBSD 5-current client work....

Garrett Wollman wollman@khavrinen.lcs.mit.edu
Thu, 25 Sep 2003 01:32:55 -0400 (EDT)


I'm currently trying to get the client working under FreeBSD
5-current.  I've tried to fix some of the significant
synchronization problems in the existing code, which look like the
hardest part of the project.  (I haven't looked at any of the VM
integration issues yet -- I've confirmed that pioctl(), fstat(), and
getdirentries() work in /afs, but that's about it.)  If I can stop
holding my nose at the stink of some of this code for a little bit,
these are the issues I have run into so far:

- Lots of potential for ISO C aliasing bugs; I've hacked around this
with -fno-strict-aliasing for now.

- It is essential that libafs.ko be compiled using the kernel's actual
option headers (opt_global.h in particular) and not fake ones.
Perhaps once this is debugged there will be less of a need for this.
FBSD/vfs_vnops.h in particular is bad juju.  I just modified
MakefileProto.FBSD.in to hack in the right cc options.  (There have
been a few times when I've hand-hacked config.status, too, to remove
bogus compiler options whose configure tests I just haven't gotten
around to figuring out.)

- Most of the necessary fixes were simply telling the FreeBSD port to
use the same setup as OpenBSD.  It will never be possible (or
sensible) for AFS to ``subclass'' struct vnode in FreeBSD.

- afs_xioctl() is utterly broken.  Thankfully it's also entirely
unnecessary.  I just removed it.

- AFS needs to implement VOP_PATHCONF() or else everyday utilities
like ls(1) get really unhappy.  (I don't know why ls needs pathconf(),
nor do I care to find out.  It Just Needs To Work.)

- The shutdown code is called with init's proc lock held, and at some
point it tries to sleep, which is not permitted (you can't sleep while
a mutex other than Giant is held).

- I've tried to reimplement most of osi_sleep.c, which was quite
nasty-looking.  In order to do that, while avoiding race conditions, I
turned afs_global_lock into afs_global_mtx.  This brought up the most
serious problem I've come up against: the AFS code loves to call
OSI_KALLOC() with the global lock held.  It also loves to call other
kernel entry points like namei() in similar situations.  I need to
have it not do that; how badly is the code likely to break if I drop
the global lock in these places?  (As noted in the previous item, you
can't sleep while a mutex other than Giant is held.  In addition, the
WITNESS lock debugging tool noted a lock order reversal [possible
deadlock] which dropping afs_global_mtx will probably resolve as a
side effect.)

- It seems as if, with a decent kernel memory allocator, almost all of
afs_osi_alloc() could be thrown out.  Thoughts on doing this?  This
would be similar to what I've done in FBSD/osi_sleep.c (in terms of
replacing bad emulations of kernel primitives -- condition variables
in that case -- with the real thing), and I suspect it would make
debugging much easier.
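
To make the question about dropping the global lock concrete, here is
a rough userland sketch of the usual pattern: release the mutex around
the call that may sleep, retake it, and recheck the protected state,
since another thread may have changed it in the meantime.  This is not
the OpenAFS code.  It uses pthreads and calloc() as stand-ins for the
kernel mutex and OSI_KALLOC(), and the names global_mtx, cache_buf and
get_cache_buf() are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userland stand-in for afs_global_mtx. */
static pthread_mutex_t global_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Shared state protected by global_mtx: a lazily allocated buffer. */
static int *cache_buf;

/*
 * Return the shared buffer, allocating it on first use.
 * Must be called with global_mtx held; returns with it held.
 * The lock is dropped around the allocation, which may block.
 */
static int *
get_cache_buf(size_t n)
{
    int *p;

    assert(n > 0);
    if (cache_buf != NULL)
        return cache_buf;

    pthread_mutex_unlock(&global_mtx);
    p = calloc(n, sizeof(*p));      /* may block; lock is not held */
    pthread_mutex_lock(&global_mtx);

    if (p == NULL)
        return NULL;

    /* Recheck: another thread may have installed a buffer meanwhile. */
    if (cache_buf == NULL)
        cache_buf = p;
    else
        free(p);                    /* lost the race; discard ours */

    return cache_buf;
}
```

The recheck after relocking is the part that matters: any caller that
examined protected state before the lock was dropped has to assume
that state is stale afterwards.  How much of the AFS code makes that
assumption safely is exactly the open question.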

A slightly stale version of the diffs I've got now can be found at
<http://lcs.mit.edu/~wollman/openafs-fbsd52.patch>.

-GAWollman