[OpenAFS] linux kernel options ?

EC wingman@waika9.com
Thu, 25 Nov 2004 12:09:13 +0100


>
>>>> Ksymoops gives :
>>>>
>>>>>> EIP; c0110806 <wait_for_completion+66/ac>   <=====
>>>>
>>>> Trace; d88cd03b <[libafs-2.4.28.4es]afs_DaemonOp+6b/e0>
>>>> Trace; d88ccf90 <[libafs-2.4.28.4es]afsd_launcher+0/40>
>>>> Trace; d88cdb99 <[libafs-2.4.28.4es]afs_syscall_call+ae9/b10>
>>>> Trace; c01296f2 <__alloc_pages+6a/28c>
>>>> Trace; c0127f42 <lru_cache_add+5a/60>
>>>> Trace; c011f893 <do_wp_page+1af/1f0>
>>>> Trace; c011fe53 <handle_mm_fault+7b/b4>
>>>> Trace; c010f6e4 <do_page_fault+160/4a0>
>>>> Trace; d88cdddf <[libafs-2.4.28.4es]afs_syscall+1bf/210>
>>>> Trace; c011be20 <sys_setpriority+5c/d4>
>>>> Trace; c0106b27 <system_call+33/38>
>>>>
>>>> Or :
>>>>
>>>>>> EIP; c01105e6 <__wake_up+32/a4>   <=====
>>>>
>>>> Trace; d88edba0 <[libafs-2.4.28.4es]afs_waitForever+0/4>
>>>> Trace; d88ca5c0 <[libafs-2.4.28.4es]afs_osi_Wakeup+40/50>
>>>> Trace; d88ccc35 <[libafs-2.4.28.4es]afs_InitSetup+85/90>
>>>> Trace; d88edb30 <[libafs-2.4.28.4es]afs_InitSetup_done+0/4>
>>>> Trace; d88ccbc6 <[libafs-2.4.28.4es]afs_InitSetup+16/90>
>>>> Trace; d88cd077 <[libafs-2.4.28.4es]afs_DaemonOp+a7/e0>
>>>> Trace; d88cdb99 <[libafs-2.4.28.4es]afs_syscall_call+ae9/b10>
>>>> Trace; c01296f2 <__alloc_pages+6a/28c>
>>>> Trace; c0127f42 <lru_cache_add+5a/60>
>>>> Trace; c011f893 <do_wp_page+1af/1f0>
>>>> Trace; c011fe53 <handle_mm_fault+7b/b4>
>>>> Trace; c010f6e4 <do_page_fault+160/4a0>
>>>> Trace; d88cdddf <[libafs-2.4.28.4es]afs_syscall+1bf/210>
>>>> Trace; c011be20 <sys_setpriority+5c/d4>
>>>> Trace; c0106b27 <system_call+33/38>
>>>>
>>>
>>> What you see here is some kind of endless loop: the client threads
>>> are waiting for each other during initialization, and one of them
>>> never completes.
>>> The real question is which one, and what is it doing?
>>>
>>> Can you start afsd with -verbose -debug so we can see the
>>> syscalls? Maybe something is going wrong there.
>>
>> OK. BTW : machine has 400MB physical RAM, 1GB SWAP, 170MB physical mem
>> free.
>> AFS cache is supposed to be ~100MB.
>>
>> Here's the log with -verbose -debug :
>>
>> Starting AFS services.....
>> afsd: My home cell is 'localdomain.com'
>> ParseCacheInfoFile: Opening cache info file '/etc/openafs/cacheinfo'...
>> ParseCacheInfoFile: Cache info file successfully parsed:
>>         cacheMountDir: '/srv/afs'
>>         cacheBaseDir: '/var/openafs/cache'
>>         cacheBlocks: 100000
>> afsd: Creating '/etc/openafs/AFSLog'
>> CreateCacheFile: Creating cache file '/etc/openafs/AFSLog'
>> afsd: 2400 inode_for_V entries at 0x8074728, 9600 bytes
>> SScall(137, 28, 17)=0 afsd: Forking rx listener daemon.
>> afsd: Forking rx callback listener.
>> afsd: Forking rxevent daemon.
>> SScall(137, 28, 36)=0 afsd: Calling AFSOP_CACHEINIT: 2800 stat cache
>> entries, 2400 optimum cache files, 19660800 blocks in the cache, flags
>> =
>> 0x1, dcache entries 2400
>> SScall(137, 28, 6)=0 afsd: Sweeping workstation's AFS cache directory.
>> afsd: Using memory cache, not swept
>> afsd: Calling AFSOP_CACHEINFO: dcache file is '/CacheItems'
>> afsd: Calling AFSOP_CELLINFO: cell info file is '/CellItems'
>> SScall(137, 28, 34)=0 SScall(137, 28, 29)=0 SScall(137, 28, 35)=0 afsd:
>> Forking AFS daemon.
>> afsd: Forking Check Server Daemon.
>> afsd: Forking 5 background daemons.
>> afsd: Calling AFSOP_VOLUMEINFO: volume info file is '/VolumeItems'
>> SScall(137, 28, 8)=2 afsd: Calling AFSOP_AFSLOG: volume info file is
>> '/etc/openafs/AFSLog'
>> afsd: Calling AFSOP_CACHEINODE for each of the 2400 files in ''
>> afsd: Calling AFSOP_GO with cacheSetTime = 0
>
>What fails here is the call to AFSOP_VOLUMEINFO.
>
>I have no clue why. I expect "afs_InitVolumeInfo()" to be causing the
>error, but I have no idea what the reason might be in your environment.
>It is the cache initialization that goes wrong. Check your cache
>settings. (Is your disk cache directory there? What file system is it
>on? Do you have enough space on that partition?)
>
>You can try memcache to confirm that the disk cache is the problem.

I'm already using a memory cache. afsd is launched with:
# /usr/sbin/afsd -stat 2800 -dcache 2400 -daemons 5 -volumes 128 -nosettime
-memcache -verbose -debug

# cat /etc/openafs/cacheinfo
/srv/afs:/var/openafs/cache:100000
(BTW: I also tried changing it to 10000 (10MB), with about 50MB free, but
got the same oops.)
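
For reference, the cacheinfo line can be read as three colon-separated
fields, matching the cacheMountDir / cacheBaseDir / cacheBlocks values that
ParseCacheInfoFile prints in the logs. The sketch below is only an
illustration: parse_cacheinfo is a hypothetical helper, not part of OpenAFS,
and the block size of roughly 1KB is inferred from "10000 = 10MB" above. It
also checks one numeric observation: the "19660800 blocks" reported by
AFSOP_CACHEINIT equals 2400 (the -dcache value) times 8192. That would be
consistent with the memory cache being sized from -dcache and an assumed
8192-byte chunk rather than from cacheinfo, but that is an inference from
the numbers, not something confirmed in this thread.

```python
def parse_cacheinfo(line):
    """Split a cacheinfo line into (mount dir, cache dir, block count).

    Hypothetical helper for illustration; afsd has its own parser.
    """
    mount_dir, cache_dir, blocks = line.strip().split(":")
    return mount_dir, cache_dir, int(blocks)


mount_dir, cache_dir, blocks = parse_cacheinfo("/srv/afs:/var/openafs/cache:100000")

# Assumption: one block is about 1KB, as implied by "10000 = 10MB".
approx_mb = blocks / 1000
print(mount_dir, cache_dir, blocks, approx_mb)

# Numeric check only: -dcache entries times an ASSUMED 8192-byte chunk size.
dcache_entries = 2400
assumed_chunk_bytes = 8192
print(dcache_entries * assumed_chunk_bytes)  # 19660800, the figure in the log
```

If that reading is right, it would also explain why changing the size in
cacheinfo made no difference while -memcache is in use.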

/srv/afs exists, and so does /var/openafs/cache.

BTW: if I cut off the server (I don't know whether that is of any use to
you), I get:

45# /usr/sbin/afsd -stat 2800 -dcache 2400 -daemons 5 -volumes 128
-nosettime -memcache -verbose -debug
afsd: My home cell is 'mydomain.com'
ParseCacheInfoFile: Opening cache info file '/etc/openafs/cacheinfo'...
ParseCacheInfoFile: Cache info file successfully parsed:
        cacheMountDir: '/srv/afs'
        cacheBaseDir: '/var/openafs/cache'
        cacheBlocks: 100000
afsd: 2400 inode_for_V entries at 0x8074728, 9600 bytes
SScall(137, 28, 17)=-1 afsd: Forking rx listener daemon.
afsd: Forking rx callback listener.
afsd: Forking rxevent daemon.
SScall(137, 28, 48)=-1 SScall(137, 28, 0)=-1 SScall(137, 28, 19)=-1
SScall(137, 28, 36)=-1 afsd: Error -1 in basic initialization.
afsd: Calling AFSOP_CACHEINIT: 2800 stat cache entries, 2400 optimum cache
files, 19660800 blocks in the cache, flags = 0x1, dcache entries 2400
SScall(137, 28, 6)=-1 afsd: Sweeping workstation's AFS cache directory.
afsd: Using memory cache, not swept
afsd: Calling AFSOP_CACHEINFO: dcache file is '/CacheItems'
afsd: Calling AFSOP_CELLINFO: cell info file is '/CellItems'
SScall(137, 28, 34)=-1 SScall(137, 28, 29)=-1 Adding cell 'foreversafe.com':
error -1
SScall(137, 28, 35)=-1 afsd: Forking AFS daemon.
afsd: Forking Check Server Daemon.
afsd: Forking 5 background daemons.
SScall(137, 28, 1)=-1 SScall(137, 28, 4)=-1 afsd: No check server daemon in
client.
SScall(137, 28, 2)=-1 SScall(137, 28, 2)=-1 SScall(137, 28, 2)=-1
SScall(137, 28, 2)=-1 afsd: Calling AFSOP_VOLUMEINFO: volume info file is
'/VolumeItems'
SScall(137, 28, 8)=-1 afsd: Calling AFSOP_AFSLOG: volume info file is
'/etc/openafs/AFSLog'
afsd: Calling AFSOP_CACHEINODE for each of the 2400 files in ''
afsd: Calling AFSOP_GO with cacheSetTime = 0
SScall(137, 28, 100)=-1 afsd: All AFS daemons started.
afsd: Forking trunc-cache daemon.
afsd: Mounting the AFS root on '/srv/afs', flags: 0.
SScall(137, 28, 2)=-1 SScall(137, 28, 3)=-1 afsd: Can't mount AFS on
/srv/afs(19)
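
Because the SScall(...) results are interleaved with the afsd messages in
both logs, it can help to pull out the return codes mechanically. The
following is a minimal sketch, assuming that the third SScall argument is
the operation code (it is the only one that varies) and that any non-zero
result is worth a look. The function names are made up for this example.

```python
import re

# Matches entries such as "SScall(137, 28, 8)=2" in afsd -debug output.
SSCALL_RE = re.compile(r"SScall\((\d+),\s*(\d+),\s*(\d+)\)=(-?\d+)")


def sscall_results(log_text):
    """Return (third argument, return value) pairs in order of appearance."""
    return [(int(m.group(3)), int(m.group(4))) for m in SSCALL_RE.finditer(log_text)]


def failed_calls(log_text):
    """Keep only the calls whose return value is non-zero."""
    return [(op, rc) for op, rc in sscall_results(log_text) if rc != 0]


# Excerpt from the first log above (server reachable), quote markers removed.
first_log = """\
SScall(137, 28, 17)=0 afsd: Forking rx listener daemon.
SScall(137, 28, 36)=0 afsd: Calling AFSOP_CACHEINIT: 2800 stat cache
SScall(137, 28, 6)=0 afsd: Sweeping workstation's AFS cache directory.
SScall(137, 28, 34)=0 SScall(137, 28, 29)=0 SScall(137, 28, 35)=0 afsd:
SScall(137, 28, 8)=2 afsd: Calling AFSOP_AFSLOG: volume info file is
"""

print(failed_calls(first_log))  # [(8, 2)]
```

On the first log this leaves a single entry, the call that returned 2, which
is the one identified above as AFSOP_VOLUMEINFO. On the second log every
call returns -1, so the filter does not narrow anything down there.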