[OpenAFS-devel] OpenAFS server 1.3.80 on x86_64

Ulrich Schwickerath ulrich.schwickerath@iwr.fzk.de
Sat, 2 Apr 2005 20:41:17 +0200


Hi, 

Thanks a lot for the answer. Here is the output of afsd. Below it I have also 
attached the output of ps after starting afsd. If the server is not running at 
all, things look similar (I did not check the output in detail, but I can do 
so if needed), but then afsd exits with error 22 (cannot mount /afs). I'm 
using Heimdal instead of kaserver, which seems to work fine, as can be seen 
from the output attached a bit further below. If I try to remove the afs 
module, the machine usually panics.

Thanks,
Ulrich

[root@iwrcgop028 root]# cd /usr/vice/etc
[root@iwrcgop028 etc]# ./afsd -memcache -verbose -nosettime -debug -stat 2000 
-dcache 800 -daemons 3 -volumes 70
afsd: My home cell is 'fzk.de'
ParseCacheInfoFile: Opening cache info file '/usr/vice/etc/cacheinfo'...
ParseCacheInfoFile: Cache info file successfully parsed:
        cacheMountDir: '/afs'
        cacheBaseDir: '/usr/vice/cache'
        cacheBlocks: 100000
afsd: 800 inode_for_V entries at 0x5498b0, 3200 bytes
SScall(183, 28, 17)=0 afsd: Forking rx listener daemon.
afsd: Forking rx callback listener.
SScall(183, 28, 48)=0 afsd: Forking rxevent daemon.
SScall(183, 28, 19)=0 SScall(183, 28, 36)=0 afsd: Calling AFSOP_CACHEINIT: 
2000 stat cache entries, 800 optimum cache files,6553600 blocks in the cache, 
flags = 0x1, dcache entries 800
SScall(183, 28, 6)=0 afsd: Sweeping workstation's AFS cache directory.
afsd: Using memory cache, not swept
SScall(183, 28, 0)=0 afsd: Calling AFSOP_CACHEINFO: dcache file is 
'/CacheItems'
afsd: Calling AFSOP_CELLINFO: cell info file is '/CellItems'
SScall(183, 28, 34)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28,29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183,28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 
SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 
28, 29)=0 SScall(183, 28, 29)=0 SScall(183, 28, 35)=0 afsd: Forking AFS 
daemon.
afsd: Forking Check Server Daemon.
afsd: Forking 3 background daemons.
SScall(183, 28, 1)=0 SScall(183, 28, 4)=0 afsd: Calling AFSOP_VOLUMEINFO: 
volume info file is '/VolumeItems'
afsd: Calling AFSOP_AFSLOG: volume info file is '/usr/vice/etc/AFSLog'
afsd: Calling AFSOP_CACHEINODE for each of the 800 files in ''
afsd: Calling AFSOP_GO with cacheSetTime = 0
SScall(183, 28, 100)=0 afsd: All AFS daemons started.
afsd: Forking trunc-cache daemon.
afsd: Mounting the AFS root on '/afs', flags: 0.
SScall(183, 28, 3)=0 SScall(183, 28, 2)=0 SScall(183, 28, 2)=0 SScall(183, 28, 
2)=0




[root@iwrcgop028 root]# ps xua | grep afsd
root      4722  0.0  0.0  4984  772 pts/0    S    20:23   0:00 ./afsd 
-memcache -verbose -nosettime -debug -stat 2000 -dcache 800 -daemons 3 
-volumes 70
root      4723  0.0  0.0     0    0 pts/0    Z<   20:23   0:00 [afsd 
<defunct>]
root      4724  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4727  0.0  0.0     0    0 pts/0    Z<   20:23   0:00 [afsd 
<defunct>]
root      4729  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4730  0.0  0.0     0    0 ?        SW   20:23   0:00 [afsd]
root      4731  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4733  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4735  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4737  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4739  0.0  0.0     0    0 pts/0    Z    20:23   0:00 [afsd 
<defunct>]
root      4817  0.0  0.0 36180  712 pts/1    S    20:26   0:00 grep afsd
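
A listing like the one above can be condensed into a per-state count with a
small filter. This is only a sketch: the function name count_afsd_states is
hypothetical, and it assumes BSD-style `ps xua` output in which the eighth
column is STAT.

```shell
# Hypothetical helper: summarize afsd process states from `ps xua`-style
# input on stdin. Assumes column 8 is STAT; only its first letter
# (S, Z, D, ...) is counted, so "Z<" and "Z" both count as zombies.
count_afsd_states() {
    awk '/afsd/ && !/grep/ { n[substr($8, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}
```

It could be used as `ps xua | count_afsd_states`; wrapped continuation lines
that do not contain "afsd" are ignored.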


[root@iwrcgop028 root]# bos status iwrcgop028.fzk.de
bos: no such entry (getting tickets)
bos: running unauthenticated
Instance buserver, currently running normally.
Instance ptserver, currently running normally.
Instance vlserver, currently running normally.
Instance fs, currently running normally.
    Auxiliary status is: file server running.
Instance upserver, currently running normally.
Instance kdc, currently running normally.
Instance kadmind, currently running normally.
Instance kpasswdd, currently running normally.

[root@iwrcgop028 root]# /usr/heimdal/bin/kinit admin
admin@FZK.DE's Password:
kinit: NOTICE: ticket renewable lifetime is 1 week
[root@iwrcgop028 root]# tokens

Tokens held by the Cache Manager:

Tokens for afs@fzk.de [Expires Apr  3 21:54]
   --End of list--

[root@iwrcgop028 root]# bos status iwrcgop028.fzk.de
Instance buserver, currently running normally.
Instance ptserver, currently running normally.
Instance vlserver, currently running normally.
Instance fs, currently running normally.
    Auxiliary status is: file server running.
Instance upserver, currently running normally.
Instance kdc, currently running normally.
Instance kadmind, currently running normally.
Instance kpasswdd, currently running normally.







On Friday 01 April 2005 23:44, Horst Birthelmer wrote:
> On Apr 1, 2005, at 10:47 PM, Ulrich Schwickerath wrote:
> > Hi,
> > I'm trying to set up a new AFS server using OpenAFS 1.3.80 on a dual
> > Opteron node running Scientific Linux 3.03/3.04 (recompiled RedHat REL).
> > The kernel version is 2.4.21-27.0.2.EL.XFSsmp (including a patch for the
> > XFS file system). I'm already running another server (in a different
> > cell) on a dual Xeon box with Redhat Linux, and that works just fine.
> > What happens is the following:
> > 1/ Running the client only on the Opteron system, I can connect to my
> > already existing cell without problems (setting CellServDB and ThisCell
> > appropriately).
> > 2/ I can start all server processes on the Opteron for the new cell.
> > Everything looks normal: bos reports that all processes are running (and
> > they are). Authentication works fine: I created an admin account for
> > which I can get a valid token, with which I can run bos in authenticated
> > mode, just as it should be.
> > 3/ If I update /usr/vice/etc/CellServDB and ThisCell for my new cell on
> > the Opteron system and try to start the client, it tells me that all
> > daemons have been started, and gets stuck immediately afterwards. The
> > module gets loaded, but the forked afsd processes go into zombie state
> > and block, cannot be killed any more, and if I try to remove the module,
> > the machine panics.
> > The effect is the same if I run the client on a different machine than
> > the server, so I assume the problem must be somewhere on the server side.
> >
> > I checked the lists but did not see a report of something like that
> > recently.
> > Has anybody succeeded in running a (development) server on x86_64?
> > Any idea what is wrong?
>
> Could you start afsd with all your usual options but add "-verbose
> -debug"??
> So we can see where it gets stuck.
>
> Horst

-- 
__________________________________________
Dr. Ulrich Schwickerath
Forschungszentrum Karlsruhe
GRID-Computing and e-Science
Institut for Scientific Computing (IWR)
P.O. Box 36 40
76021 Karlsruhe, Germany

Tel: +49(7247)82-8607
Fax: +49(7247)82-4972 

e-mail: ulrich.schwickerath@iwr.fzk.de
PGP DH/DSS Key: ID 0xCEB9826F
Fingerprint: 5537 8473 CD26 507E 8EE2  BAAF 98E2 FD16 CEB9 826F
__________________________________________