[OpenAFS] kernel panics with 1.6.0 and 1.6.1pre2 on openindiana

Logan O'Sullivan Bruns logan@gedanken.org
Fri, 27 Jan 2012 10:34:45 -0800


I'm setting up a couple of machines with OpenIndiana 151a and adding them
to an existing cell. The file server side is working pretty well. I had
some problems with the salvager crashing, which were fixed by updating from
1.6.0 to 1.6.1pre2, and occasional problems with deletion of extant clones
not working properly, which could be worked around. However, I haven't been
able to get the client side to run under load without a kernel panic.

With the 1.6.0 afsd and kernel module it runs until I put it under load, at
which point I get a kernel panic that looks like this:

ffffff000d1973c0 unix:die+10f ()                                                
ffffff000d1974d0 unix:trap+1573 ()                                              
ffffff000d1974e0 unix:cmninttrap+c2 ()                                          
ffffff000d197620 afs:afs_DynrootNewVnode+139 ()                                 
ffffff000d1976d0 afs:afs_GetVCache+459 ()                                       
ffffff000d197820 afs:afs_lookup+1bad ()                                         
ffffff000d197870 afs:gafs_lookup+5c ()                                          
ffffff000d197910 genunix:fop_lookup+ed ()                                       
ffffff000d197b50 genunix:lookuppnvp+28f ()                                      
ffffff000d197bf0 genunix:lookuppnatcred+11b ()                                  
ffffff000d197ce0 genunix:lookupnameatcred+e7 ()                                 
ffffff000d197d70 genunix:lookupnameat+69 ()                                     
ffffff000d197df0 genunix:cstatat_getvp+12b ()                                   
ffffff000d197e70 genunix:cstatat64_32+5c ()                                     
ffffff000d197ea0 genunix:fstatat64_32+4c ()                                     
ffffff000d197ec0 genunix:stat64_32+25 ()                                        
ffffff000d197f10 unix:brand_sys_syscall32+17a ()                                

With the 1.6.1pre2 afsd and kernel module it panics immediately upon afsd
startup, with a stack trace that looks like this:

ffffff000d4037f0 unix:die+10f ()                                                
ffffff000d403900 unix:trap+1573 ()                                              
ffffff000d403910 unix:cmninttrap+c2 ()                                          
ffffff000d403a60 afs:afs_SetupVolume+30 ()                                      
ffffff000d403ab0 afs:afs_NewDynrootVolume+121 ()                                
ffffff000d403b90 afs:afs_CheckRootVolume+ce ()                                  
ffffff000d403bf0 afs:afs_Daemon+689 ()                                          
ffffff000d403c20 afs:afsd_thread+112 ()                                         
ffffff000d403c30 unix:thread_start+8 ()                                         

The general configuration is the same as I've used on a few Solaris 10 SPARC
systems with the 1.4.X series for a number of years. I'm not sure it is
optimal, but it has worked well for me. Basically a 2G UFS zvol for the cache
with parameters like this:

EXTRAOPTS="-afsdb -dynroot -nosettime -chunksize 19 -blocks 1000000"
LARGE="-stat 2800 -dcache 2400 -daemons 5 -volumes 128"
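
For reference, here is a rough sketch of what those cache settings work out
to. It assumes, as I understand the afsd documentation, that -chunksize is
the base-2 logarithm of the chunk size in bytes and that -blocks counts 1 KB
blocks. The function and key names are made up for illustration and are not
part of OpenAFS.

```python
# Back-of-the-envelope sizing for the afsd cache options quoted above.
# Assumptions (not taken from the OpenAFS source): -chunksize N means
# 2**N bytes per chunk, and -blocks counts 1 KB (1024-byte) blocks.

def cache_summary(chunksize_log2, blocks_kb):
    """Return chunk size, cache size, and how many full chunks fit."""
    chunk_bytes = 2 ** chunksize_log2
    cache_bytes = blocks_kb * 1024
    return {
        "chunk_kib": chunk_bytes // 1024,
        "cache_mib": cache_bytes / (1024 * 1024),
        "max_full_chunks": cache_bytes // chunk_bytes,
    }

# Values from EXTRAOPTS: -chunksize 19 -blocks 1000000
summary = cache_summary(19, 1_000_000)
print(summary)
```

Under those assumptions the chunks are 512 KiB each and the cache is limited
to roughly 977 MiB, i.e. about half of the 2G zvol.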

Any tips on workarounds, or on whether it is worth trying the latest source
from git, would be appreciated.