[OpenAFS-devel] Solaris 10 predicament update

Dale Ghent daleg@umbc.edu
Sat, 8 Sep 2007 19:46:54 -0400


I'm using this email to report on the problem, what I've found, and  
lay out what our options are.

Background
With the advent of Solaris 10 8/07 (aka s10u4), the internal Private  
kernel interfaces AFS used to access network interface properties  
changed due to the integration of the pfhooks/netstack feature.  
Specifically, an argument was added to the ill_* functions to
accommodate the netstack changes. That rules out using them if we
want to keep the AFS driver binary-compatible across all
permutations of the Solaris kernel.

Situation
The AFS driver code uses the ILL_* macros and functions (defined in  
<inet/ip.h>) to walk a list of network interfaces and, as is the case  
in SetServerPrefs() in src/afs/afs_server.c, pick the best interface  
to bind to in order to talk to the AFS server holding the cell's root  
volume. They are also used in src/rx/SOLARIS/rx_knet.c to gather the
MTU setting of the interface an rx packet was received on; the
retrieved value is then used to adjust the RX UDP packet size to
prevent fragmentation.
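
To make the fragmentation concern concrete, here is a minimal sketch
of the arithmetic involved. It is not the rx code; max_udp_payload()
is a made-up helper, and it assumes IPv4 with no IP options:

```c
#include <assert.h>

/* Header sizes for IPv4 (no options) and UDP. */
enum { IPV4_HDR_LEN = 20, UDP_HDR_LEN = 8 };

/*
 * Hypothetical helper: largest UDP payload that fits in a single
 * unfragmented IP packet for the given interface MTU, or 0 if the
 * MTU cannot even hold the headers.
 */
static int max_udp_payload(int mtu)
{
    int overhead = IPV4_HDR_LEN + UDP_HDR_LEN;

    assert(mtu >= 0);   /* a negative MTU is a caller bug */
    return (mtu > overhead) ? mtu - overhead : 0;
}
```

With a 1500-byte Ethernet MTU this leaves 1472 bytes for the rx
header plus data.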

My research has concluded that there are no straightforward Public
interfaces in the Solaris kernel which exist all the way back to  
Solaris 10 FCS. Also, there are no Private interfaces which directly  
address our needs and are stable back to Solaris 10 FCS.

What to do?
There are a few alternatives we can consider, and I'd like to present  
them for discussion... ordered from "most likely" to "least likely":

1) We can mimic what we've traditionally done and, instead of using
ILL_*, use the Public ldi_ioctl() interface to make sockio calls to
/dev/udp and fill Private structs with the returned network interface
information. While this may be alright to do in the case of
SetServerPrefs(), it would be a huge performance hit in the rx
code. When an rx UDP packet is received, the call stack looks like this:

rxi_ReceivePacket->rxi_FindConnection->rxi_FindPeer
  ->rxi_InitPeerParams()->rxi_FindIfMTU()->rxi_GetIFInfo()

Both rxi_FindIfMTU() and rxi_GetIFInfo() walk the ILL structs to get
the interface address and MTU and, from what I can tell, they do this
for *every* *received* *packet*. So, given that AFS seems rather
obsessive about staying up-to-date on an interface's MTU, we would be
doing ioctls on a file (/dev/udp) for every rx packet we get. This
would be hellishly expensive. Is this a correct assumption?


2) Option 2 would be to use the above-mentioned ioctl-based method,
but to remove it entirely from the critical code path. We could, at
AFSinit() time, create a worker thread which would periodically
update a global struct of interface telemetry. The worker thread
would wake up every, say, 30 seconds (tunable), lock the struct via
mutex, update it, unlock, and return to sleep. The RX and
SetServerPrefs() code can read their desired values from this struct
when they need them, spinning if need be.
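
A rough user-space sketch of the shared struct in option 2, using
pthreads in place of kernel mutexes. All names here (if_snapshot,
snapshot_update(), snapshot_mtu(), MAX_IFS) are invented for
illustration, and the worker thread's sleep/wake loop is left out;
only the locked update and the locked read are shown:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_IFS 8   /* arbitrary cap for this sketch */

struct if_entry {
    unsigned int addr;  /* IPv4 address, host byte order */
    int mtu;
};

struct if_snapshot {
    pthread_mutex_t lock;
    int count;
    struct if_entry ifs[MAX_IFS];
};

static struct if_snapshot g_ifs = { PTHREAD_MUTEX_INITIALIZER, 0, {{0, 0}} };

/* What the periodic worker would call after gathering fresh data. */
static void snapshot_update(const struct if_entry *fresh, int n)
{
    assert(n >= 0 && n <= MAX_IFS);

    pthread_mutex_lock(&g_ifs.lock);
    if (n > 0)
        memcpy(g_ifs.ifs, fresh, (size_t)n * sizeof(*fresh));
    g_ifs.count = n;
    pthread_mutex_unlock(&g_ifs.lock);
}

/* What the packet path would call: MTU for addr, or fallback if unknown. */
static int snapshot_mtu(unsigned int addr, int fallback)
{
    int i;
    int mtu = fallback;

    pthread_mutex_lock(&g_ifs.lock);
    for (i = 0; i < g_ifs.count; i++) {
        if (g_ifs.ifs[i].addr == addr) {
            mtu = g_ifs.ifs[i].mtu;
            break;
        }
    }
    pthread_mutex_unlock(&g_ifs.lock);
    return mtu;
}
```

The reader only ever pays for a lock and a short array scan; the
expensive ioctl work happens in the worker, off the packet path.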


3) This is Rob's idea, so blame him if you reel back in horror. We
derive a conditional by testing for a netstack symbol in the kernel ip
module. If TRUE, we have a function pointer that points to the new
ill_* functions with the extra argument. If FALSE, we point to the old
ones. Yum. This would certainly involve the least amount of code.
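
To illustrate the shape of option 3, a small user-space sketch of
dispatching through a function pointer chosen at init time. The
symbol test itself is not shown (the netstack_present flag stands in
for it), and lookup_old()/lookup_new() are invented stand-ins, not
the real kernel functions or their signatures:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the old and new generations of a lookup routine. */
static int lookup_old(int ifindex)
{
    return 1000 + ifindex;
}

static int lookup_new(int ifindex, void *stack)
{
    (void)stack;    /* the extra argument added by the newer kernel */
    return 2000 + ifindex;
}

struct lookup_ops {
    int has_stack_arg;  /* result of the one-time symbol test */
    union {
        int (*old_fn)(int);
        int (*new_fn)(int, void *);
    } u;
};

/* Pick the implementation once, based on the (hypothetical) test. */
static void lookup_ops_init(struct lookup_ops *ops, int netstack_present)
{
    assert(ops != NULL);

    ops->has_stack_arg = netstack_present;
    if (netstack_present)
        ops->u.new_fn = lookup_new;
    else
        ops->u.old_fn = lookup_old;
}

/* Callers use one entry point; the extra argument is dropped if unused. */
static int lookup(const struct lookup_ops *ops, int ifindex, void *stack)
{
    return ops->has_stack_arg ? ops->u.new_fn(ifindex, stack)
                              : ops->u.old_fn(ifindex);
}
```

Keeping the two pointer types separate in a union avoids calling a
function through a pointer of the wrong signature.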


4) We toss caution to the wind, let modern routers deal with UDP
frags the way they should, and dispense with the UDP packet size
adjustments based on MTU, or at least nail them to 1500. If you're
still using AFS over a PPP connection... well... sorry 'bout that. We  
also let the kernel routing table do its job and dispense with  
selecting interfaces. I don't think even the NFS code jumps through  
these kinds of hoops. Is there a reason we should be? I admit I'm not  
too familiar with the inner details and history of things here, so  
feel free to gently clue me in.


5) Continue to use the ILL method and release OpenAFS 1.4.5 with the  
code being compatible with s10u4. We simply tell people that if you  
want to run OpenAFS client version 1.4.5 or greater, you also need to  
run Solaris KU 120012-14 (x86) or whatever the analog is if you're  
running SPARC.

6) Any other idears?

/dale


--
Dale Ghent
Specialist, Storage and UNIX Systems
UMBC - Office of Information Technology
ECS 201 - x51705