[OpenAFS-devel] kernel BUG at /scratch/openafs/src/libafs/MODLOAD-2.6.13-MP/rx_kcommon.c:131!

Tim Spriggs tims@lpl.arizona.edu
Wed, 07 Sep 2005 20:02:41 -0700


Jeffrey Altman wrote:

>Tim Spriggs wrote:
>
>>chas williams - CONTRACTOR wrote:
>>
>>>In message <DFC9582579DBA29B34C84527@endicott>,Chaskiel M Grundman writes:
>>>
>>>>>+			atomic_set(&dp->d_count, 1);
>>>>>
>>>>Is that really appropriate? You didn't make this dentry up yourself. It is
>>>>presumably the same dentry as afs_globalVfs->s_root, and may be the current
>>>>directory of any number of processes, etc.
>>>>
>>>yes.  the only time this code does anything is when you switch over
>>>from the RW afs.root to the RO afs.root.  can you get a reference to a
>>>RO path when the root of the search is a RW path?  ergo the new root
>>>inode should not have any existing lookups/references so d_count for
>>>/afs should become 1.
>>>
>>>yes, this seems wrong but linux doesn't seem to expect a filesystem to
>>>want to change the root mount point on the fly.  i hesitate to d_drop,
>>>d_alloc_root() and then change sb->s_root.  i have children dentries
>>>pointing to the current root dentry.
>>>
>>>this is also a fairly unique situation.  only happens when you are
>>>setting up a new cell.
>>>
>>I don't know if this is supposed to be supported or not but I will
>>authenticate to two different cells and be able to use both in linux.
>>
>>I have two interesting issues:
>>
>>thomas:/afs/lpl.arizona.edu# kinit tims@LPL.ARIZONA.EDU
>>Password for tims@LPL.ARIZONA.EDU:
>>thomas:/afs/lpl.arizona.edu# aklog -d lpl.arizona.edu
>>Authenticating to cell lpl.arizona.edu (server isaac.home.tajinc.org).
>>We've deduced that we need to authenticate to realm HOME.TAJINC.ORG.
>>Getting tickets: afs/lpl.arizona.edu@HOME.TAJINC.ORG
>>Kerberos error code returned by get_cred: -1765328377
>>aklog: Couldn't get lpl.arizona.edu AFS tickets:
>>aklog: Server not found in Kerberos database while getting AFS tickets
>>
>>
>>This may be because I don't have my kerberos config [domain_realm]
>>section filled out but it has worked in the past.
>>If I give aklog lpl.arizona.edu -k LPL.ARIZONA.EDU then everything works
>>just fine.
>>
>>Anyways, the more interesting issue comes when I try accessing the
>>lpl.arizona.edu cell (after I have authenticated).
>>I get the directory structure that belongs inside home.tajinc.org.
>>
>>I don't know if this is relevant but home.tajinc.org is inside a private
>>network.
>>
>>This seemed to pertain to this thread but seemed a little more broad, so
>>I hope this may help.
>>
>>Thanks,
>>-Tim
>>
>
>Check the hostname that is being returned in response to your DNS
>queries.  I will bet that .home.tajinc.org is being appended to all
>queries.  Therefore, you are obtaining a domain/realm mapping to
>HOME.TAJINC.ORG and you are obtaining VLDB server entries for the
>home.tajinc.org cell.
>
>Jeffrey Altman
>
Nope, I have modified my resolv.conf so that lpl.arizona.edu as well as
home.tajinc.org are searched. When I do my query for afsone I get
afsone.lpl.arizona.edu (correct).
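
To make the search-list question concrete, here is a rough Python sketch of
the order in which a stub resolver would try names. It is a simplified model
of the "search" and "ndots" handling described in resolv.conf(5), not the
real resolver code. The function name candidate_names and the order of the
two search domains are assumptions made for illustration.

```python
def candidate_names(name, search, ndots=1):
    """Return the fully qualified names a stub resolver would try, in order.

    Simplified model: names with at least `ndots` dots are tried as-is
    first; shorter names get the search domains appended first.
    """
    if name.endswith("."):
        return [name]  # already absolute; the search list is not applied
    as_is = name + "."
    expanded = [f"{name}.{domain}." for domain in search]
    if name.count(".") >= ndots:
        return [as_is] + expanded
    return expanded + [as_is]


# Assumed search order; the real resolv.conf may list these differently.
search = ["lpl.arizona.edu", "home.tajinc.org"]
print(candidate_names("afsone", search))
print(candidate_names("lpl.arizona.edu", search))
```

Under this model a short name such as afsone is expanded with the first
search domain first, while a dotted name such as lpl.arizona.edu is tried
as-is before any suffix is appended.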

Also, I am using very different names for the machines in each cell.

I have the extra DNS entries so that all afs/kerberos information can be
gathered automagically by newer clients.

The only thing I have changed is my client openafs installation. The
config hasn't been changed but the kernel modules have and the client
binaries have been updated as well.
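
For reference, the get_cred error code in the aklog output quoted above can
be decoded by subtracting the base of the MIT krb5 error table. The sketch
below does that in Python; the helper name decode_krb5_error is made up for
illustration, and the table holds only two entries rather than the full list.

```python
# Base of the MIT krb5 com_err table (ERROR_TABLE_BASE_krb5).
KRB5_ERROR_TABLE_BASE = -1765328384

# Partial table: only the two "principal unknown" protocol errors.
KDC_ERRORS = {
    6: "KRB5KDC_ERR_C_PRINCIPAL_UNKNOWN (client not found in Kerberos database)",
    7: "KRB5KDC_ERR_S_PRINCIPAL_UNKNOWN (server not found in Kerberos database)",
}


def decode_krb5_error(code):
    """Return (offset, description) for a krb5 library error code."""
    offset = code - KRB5_ERROR_TABLE_BASE
    return offset, KDC_ERRORS.get(offset, "not in this partial table")


print(decode_krb5_error(-1765328377))
```

The offset comes out as 7, which matches the "Server not found in Kerberos
database" text that aklog printed: the KDC for HOME.TAJINC.ORG was asked for
afs/lpl.arizona.edu@HOME.TAJINC.ORG and does not have that principal.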


Oh, another interesting feature is that when my afs tokens expire and I
try to do a read (or perhaps many reads), I lose the ability to cd into
my root cell on the client. This is probably related to the other major
thread (lost contact with fileserver).

I can re-create this by making a token for a short amount of time,
opening an application (say xmms) and letting the application cycle
through files past the time when I should be allowed to access the
files. Right when the tokens expire, the application will hang for maybe
30 seconds (the music stops), and then I get the "lost connection with
fileserver" error (dmesg output) and then the application tries to cycle
through other files unsuccessfully. At this point I can't see the cell:

bink@thomas:/afs$ cd home.tajinc.org/
bash: cd: home.tajinc.org/: No such file or directory
bink@thomas:/afs$ ls
home.tajinc.org  lpl.arizona.edu

Restarting the openafs-client init.d script seems to be the only way to
resolve the problem.
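
The steps above could probably be reproduced without xmms. Here is a minimal
Python sketch of a stand-in that keeps reading a set of files and records
when the reads start failing. The function name cycle_reads and its
parameters are hypothetical, and I have not confirmed that plain reads like
this trigger the same hang.

```python
import os
import time


def cycle_reads(paths, rounds=1, delay=0.0, chunk=4096):
    """Read the first chunk of each file in turn, `rounds` times over.

    Yields (timestamp, path, error) tuples; error is None when the read
    succeeded, otherwise the OS error message.
    """
    for _ in range(rounds):
        for path in paths:
            try:
                with open(path, "rb") as f:
                    f.read(chunk)
                err = None
            except OSError as e:
                err = os.strerror(e.errno) if e.errno else str(e)
            yield time.time(), path, err
            time.sleep(delay)


# Example use (the file list is a placeholder, not a real path):
# for stamp, path, err in cycle_reads(my_files, rounds=600, delay=1.0):
#     print(time.strftime("%H:%M:%S", time.localtime(stamp)), path, err or "ok")
```

Running it with a short-lived token and comparing the first failing
timestamp against the token expiry time and the dmesg output should show
whether the failure lines up with the expiry.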

Hope this helps,
-Tim