[OpenAFS] OpenAFS Solaris 10 port
Jeff Woodward
Jeffrey.B.Woodward@Dartmouth.EDU
Tue, 16 Mar 2004 22:30:46 -0500 (EST)
For anyone interested in a port of OpenAFS to Solaris 10, I would be happy
to share my experiences, code base, and binaries. Presently, I have a
source tree based on the OpenAFS-1.2.10 distribution that compiles both the
32-bit and 64-bit kernel modules using the Sun compiler under SPARC Solaris 10.
I have done preliminary testing of the client using the 64-bit kernel module,
which seems to be working well (see open issues below); I have tested
most of the major client utilities. I have not yet tested the AFS server
functionality on this platform (nor have I patched for the known "date
roll-over bug" that exists in the 1.2.10 server code base).
For those who care, more details are included below. If early access to
this work is desired, please send me email directly. If the requests are
overwhelming, I will make the code and binaries accessible via AFS and the
web and repost to this list; otherwise, I will be posting "proper" changes
against the CVS tree to the devel list for inclusion in some future release
of OpenAFS.
-Jeff Woodward
Project Manager - Systems and Development
The fMRI Data Center
Dartmouth College
Overview of Current Work
------------------------
(*) Assigned SYS_NAME_ID_sun4x_510 to be 941 in src/config/afs_sysnames.h
(*) Created param.sun4x_510.h and param.sun4x_510_usr.h files in src/config
(*) Numerous changes to src/libafs/MakefileProto.SOLARIS.in to add
sun4x_510 system type.
(*) Added includes for cred_impl.h to src/afs/sysincludes.h. Caveat: this
may not be the right thing to do, as it appears that cred_impl.h isn't
supposed to be publicly exposed. It compiles and works for the moment,
but my guess is that it is subject to break in future Solaris releases.
(*) Hack to src/venus/Makefile.in to pick up sysname sun4x_510 in order
to get the correct libraries for linking kdump.
(*) Method for traversing network interfaces has changed within the
Solaris "ip" kernel module -- resulting in changes to
src/afs/SOLARIS/osi_vfsops.c, src/afs/afs_server.c, and
src/rx/SOLARIS/rx_knet.c.
(*) tv_usec and tv_sec changed(?), affecting struct timeval -- resulting in
changes to src/afs/afs_osi.h (I got away with typedef'ing
osi_timeval_t as a plain old struct timeval rather than using the
one included in the afs source with afs_int32-typed tv_sec and tv_usec
members -- no claim that this was the right choice).
(*) Solaris 10 seems to have made somewhat extensive changes to the VFS
interface. I am flying blind here as I don't have source code for
Solaris 10, so I based my changes on the comments and changes in the
sys/vfs.h and sys/vnode.h header files. These changes resulted in
modifications to src/afs/SOLARIS/osi_vfsops.c and
src/afs/SOLARIS/osi_vnodeops.c as well as src/afs/VNOPS/afs_vnop_read.c
and src/afs/VNOPS/afs_vnop_write.c. A little bit of cleanup is still
needed here. Perhaps the most nebulous of the changes to the VFS interface
is the addition of the 'caller_context_t*' parameter to the vop_read
and vop_write vnodeops, which in turn affects the VOP_READ and
VOP_WRITE macros. I am not an avid kernel hacker, so I don't know if
this construct exists in other operating systems, nor do I know what
necessitated this change on Sun's part. Nonetheless, I updated the
afs_vmread and afs_vmwrite function definitions accordingly but I
didn't add any code to utilize the caller_context. Likewise, I pass a
caller_context_t* parameter to VOP_READ and VOP_WRITE but I have no
idea "why" other than it is now part of the API. For now it seems to
be working (see also the open issues).
(*) Solaris 10 seems to have made some minor (?) changes to the sockfs
kernel interface. In particular, the sounbind() function is gone! This
resulted in changes to src/rx/SOLARIS/rx_knet.c. See open issues
below.
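To make the timeval and caller_context items above more concrete, here is a
minimal user-space sketch of how such changes could be kept behind a single
platform switch. It is not code from my tree or from OpenAFS. The macro name
AFS_SUN510_ENV, the mock_* types, and MOCK_VOP_READ are all made up for
illustration, and passing NULL as the context is an assumption (the post does
not say what value is passed). Only SYS_NAME_ID_sun4x_510 and its value come
from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* From the post: sysname ID assigned in src/config/afs_sysnames.h. */
#define SYS_NAME_ID_sun4x_510 941

/* ASSUMPTION: hypothetical platform switch. A real tree would set its
 * platform macro in param.sun4x_510.h, not in the file that uses it. */
#define AFS_SUN510_ENV 1

#ifdef AFS_SUN510_ENV
/* Choice described in the post: use the system struct timeval as-is. */
typedef struct timeval osi_timeval_t;
/* Opaque mock standing in for the new kernel context type. */
typedef struct mock_caller_context mock_caller_context_t;
#else
/* Older layout with explicit 32-bit members (int stands in for afs_int32). */
typedef struct {
    int tv_sec;
    int tv_usec;
} osi_timeval_t;
#endif

/* Mock vnode with one read operation; NOT the real Solaris vnodeops. */
struct mock_vnode {
    const char *data;
    size_t size;
#ifdef AFS_SUN510_ENV
    size_t (*vop_read)(struct mock_vnode *vp, char *buf, size_t len,
                       mock_caller_context_t *ct);
#else
    size_t (*vop_read)(struct mock_vnode *vp, char *buf, size_t len);
#endif
};

/* Callers use one form everywhere; the extra argument is supplied only
 * on the platform that requires it. ASSUMPTION: a NULL context. */
#ifdef AFS_SUN510_ENV
#define MOCK_VOP_READ(vp, buf, len) ((vp)->vop_read((vp), (buf), (len), NULL))
#else
#define MOCK_VOP_READ(vp, buf, len) ((vp)->vop_read((vp), (buf), (len)))
#endif

/* Read implementation that accepts the context but does not use it. */
#ifdef AFS_SUN510_ENV
static size_t mock_read(struct mock_vnode *vp, char *buf, size_t len,
                        mock_caller_context_t *ct)
{
    (void)ct; /* accepted for API compatibility only */
#else
static size_t mock_read(struct mock_vnode *vp, char *buf, size_t len)
{
#endif
    assert(vp != NULL && buf != NULL);
    size_t n = len < vp->size ? len : vp->size;
    memcpy(buf, vp->data, n); /* copy at most len bytes of the mock data */
    return n;
}
```

With this shape, call sites such as MOCK_VOP_READ(&vn, buf, sizeof(buf)) stay
identical on both sides of the switch, which is the property needed to keep
older Solaris releases building from the same source.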
Open Issues
-----------
(*) libtermlib.a - the Solaris 10 base has no such library -- for now, it
was copied from a Solaris 9 system, but work should be done to eliminate
the dependency on it.
(*) vfsck is not compiling (so I removed it from the Makefile) -- I am not
ready to test the server; in addition, I tend to configure servers with
the NAMEI interface. In short, vfsck is a very low priority for me.
(*) Since sounbind() is missing, the socket for RX on port 7001 remains
"mostly" bound after afs is shut down and /afs is dismounted. I have
tried various strategies such as "shutting down" and "closing" the
socket, but even with that, it remains "bound" (as can be seen in the
output of netstat -an). Attempting to restart afs without rebooting
results in one of the following: 1) RX fails to start, preventing afs
from starting; 2) everything appears to start, but accessing /afs gives
I/O errors; or 3) the system panics. This may not be a big deal to most
people since it is only an issue if you attempt to stop afs and restart
it without rebooting. Many people claim that that never works anyway... I
see no reason for it not to work if I can get the RX port unbound.
I have an email in to the Solaris 10 kernel team for direction.
(*) I plan to email the Solaris 10 kernel team asking for more information
regarding the caller_context_t* parameters in vop_read/vop_write.
(*) Proper conditional compilation of the changes noted above for the
sun4x_510 platform -- currently, my source tree is not backwards
compatible with prior versions of Solaris...
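For anyone who wants to see the first failure mode of the port 7001 issue
outside the kernel, here is a small user-space analogy. It is only a sketch:
it uses ordinary BSD sockets on the loopback address with an ephemeral port
rather than 7001, and the helper names bind_udp_loopback and bound_port are
made up. It shows that a second bind to a UDP port that is still bound is
refused, which is roughly what RX runs into on restart.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1:port (port 0 = let the system
 * pick one). Returns the descriptor, or -1 with errno set on failure. */
static int bind_udp_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        int saved = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Return the local port (host byte order) of a bound socket, or -1. */
static int bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    assert(fd >= 0);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

While the first descriptor is open, calling bind_udp_loopback() again with the
port reported by bound_port() should fail with EADDRINUSE. In user space,
close() on the first descriptor releases the port and a later bind works; the
open issue above is that the kernel-side RX socket does not seem to get to
that state without sounbind().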
Everything else is stuff that I have explicitly not tested (some I may,
others I may never test :-)
(*) HAVE NOT TESTED THE AFS SERVER COMPONENT - on my list of TO DOs
(*) PAM module - not yet tested.
(*) 32-bit SPARC kernel module is untested (I don't have any SPARCs that
I boot with a 32-bit kernel -- hint: I am not likely to ever test
this).
(*) x86 Solaris - have not attempted necessary changes for x86 arch. Code
changes are *probably* good to go, but the changes to Makefiles, param
files, and afs_sysnames.h have not been attempted. Not sure that I
have Solaris 10 for x86 installed [yet]...
(*) nfs translator - not tested (I have never attempted to use the nfs
translator in any version of OpenAFS for any platform -- hint: I am
not likely to ever test this).
(*) memory cache - not yet tested but will do soon.
(*) dynamic roots - tested only briefly (seemed to work)
(*) there are probably a dozen other switches/options that I don't
normally use or would think to test.
Next Steps
----------
(*) Check out latest development branch from CVS and integrate changes
with "proper" conditional compilation to maintain backwards
compatibility.
(*) Integrate suggestions from the Solaris 10 kernel team (if any are
received).
(*) Test the AFS server component.
(*) It would be nice if vfsck would at least compile; however, I am not
eager to claim that it would work even if I got it to compile :-)
Hopefully somebody "more qualified" than I will help out here...
(*) Post patch files to the devel list for evaluation and [hopefully]
adoption.