[OpenAFS] fileserver coredumping

J S vervoom@hotmail.com
Mon, 19 Apr 2004 14:47:43 +0000


>On Monday, April 19, 2004, at 04:10 PM, J S wrote:
>
>>
>>
>>>
>>>On Monday, April 19, 2004 12:41:49 +0000 J S <vervoom@hotmail.com> wrote:
>>>
>>>># dbx /usr/afs/bin/fileserver corefile.fs
>>>>Type 'help' for help.
>>>>reading symbolic information ...warning: no source compiled with -g
>>>>
>>>>[using memory image in corefile.fs]
>>>>
>>>>IOT/Abort trap in pthread_kill at 0xd0014af8 ($t16)
>>>>0xd0014af8 (pthread_kill+0x80) 80410014        lwz   r2,0x14(r1)
>>>>(dbx) where
>>>>pthread_kill(??, ??) at 0xd0014af8
>>>>_p_raise(??) at 0xd0013eac
>>>>raise.raise(??) at 0xd018792c
>>>>abort() at 0xd0180400
>>>>AssertionFailed() at 0x1000594c
>>>>FSYNC_sync() at 0x1004499c
>>>>_pthread_body(??) at 0xd00080c8
>>>>(dbx)
>>>
>>>
>>>This is a little odd...
>>>
>>>This backtrace suggests that an assertion failed in FSYNC_sync().  The 
>>>only assert in FSYNC_sync occurs if the fileserver is unable to bind the 
>>>fssync port after trying for about 25 seconds.  If you see this assert, 
>>>you should also see 5 messages in the log about failing to bind the port; 
>>>these should include an error code that may point you in the right 
>>>direction.
>>>
>>>Is it possible you already have another fileserver running, or something 
>>>else bound or connected to port 2040 ?
>>>
>
>
>Jeffrey was right. Somehow he's always right :-)
>
>>
>>No I don't think it's that:
>>
>># netstat -a | grep 2040
>># ps -ef | grep fileserver
>>    root 101358  54276   1 15:05:12  pts/6  0:00 grep fileserver
>>    root 101398  94564   0 15:03:58      -  0:00 /usr/afs/bin/fileserver
>>
>>But... cat FileLog shows:
>>
>>Mon Apr 19 15:05:08 2004 Getting FileServer name...
>>Mon Apr 19 15:05:08 2004 FileServer host name is 'bspc1n11'
>>Mon Apr 19 15:05:08 2004 Getting FileServer address...
>>Mon Apr 19 15:05:08 2004 FileServer bspc1n11 has address 172.30.4.11 (0xac1e040b or 0xac1e040b in host byte order)
>>Mon Apr 19 15:05:08 2004 File Server started Mon Apr 19 15:05:08 2004
>>Mon Apr 19 15:05:13 2004 FSYNC_sync: bind failed with (68), will sleep and retry
>>Mon Apr 19 15:05:18 2004 FSYNC_sync: bind failed with (68), will sleep and retry
>>Mon Apr 19 15:05:23 2004 FSYNC_sync: bind failed with (68), will sleep and retry
>>Mon Apr 19 15:05:28 2004 FSYNC_sync: bind failed with (68), will sleep and retry
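
For anyone reading this thread later: the log excerpt above can be checked mechanically. The following is only a rough sketch in Python, not anything shipped with OpenAFS; the function names bind_failure_codes and hex_to_dotted are made up for illustration. It pulls the numeric error code out of the FSYNC_sync lines and confirms that the hex address printed by the fileserver matches the dotted-quad one.

```python
import re
import socket
import struct

# Pattern for the bind-failure lines as they appear in the log excerpt above.
BIND_FAILED = re.compile(r"FSYNC_sync: bind failed with \((\d+)\)")


def bind_failure_codes(log_text):
    """Return the numeric error codes from FSYNC_sync bind-failure lines."""
    # The code sits before any mail-client line wrap, so wrapped logs still match.
    return [int(code) for code in BIND_FAILED.findall(log_text)]


def hex_to_dotted(hex_addr):
    """Convert a 32-bit address such as '0xac1e040b' to dotted-quad form."""
    return socket.inet_ntoa(struct.pack("!I", int(hex_addr, 16)))
```

Error numbers are platform specific, so the code returned here (68 in this log) has to be looked up in the errno.h of the machine running the fileserver, not on some other system.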
>>
>>
>>It should be connecting to bspc1n11e (which is on a different IP address) 
>>not bspc1n11. Do you know how I can fix this? If I do vos listaddrs it 
>>shows both bspc1n11 and bspc1n11e. Should I do vos changeaddr -remove 
>>bspc1n11 ?
>>
>>Thanks for your help. By the way how do I check the FileServer version?
>>
>
>rxdebug -version <servername>
>
>If you have some kind of 'virtual' IP addresses make sure the fileserver 
>binds itself to the right one. I had a lot of trouble with that.
>
>
>Horst
>


Thanks Horst,

# rxdebug -version bspc1n11e
Trying 161.2.249.17 (port 7000):
AFS version:  OpenAFS 1.2.10 built  2003-10-10
# rxdebug -version bspc1n11
Trying 172.30.4.11 (port 7000):
AFS version:  OpenAFS 1.2.10 built  2003-10-10

How do I force it to bind to bspc1n11e? /usr/afs/etc/NetInfo contains the 
correct address (161.2.249.17).
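
One way to narrow this down is to test, outside the fileserver, whether each address can actually be bound on this machine. The following is a minimal sketch in Python, assuming the fssync port is a TCP socket on port 2040 as discussed above; try_bind is a hypothetical helper, not part of OpenAFS.

```python
import errno
import socket


def try_bind(address, port):
    """Try to bind a TCP socket; return None on success or the errno name."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, port))
        return None
    except OSError as exc:
        # Translate the number using this platform's own errno table.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        sock.close()
```

Run on the server with the fileserver stopped, try_bind("172.30.4.11", 2040) and try_bind("161.2.249.17", 2040) should show which of the two addresses the host is willing to bind and, for a failure, the symbolic name of the error.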
