[OpenAFS-devel] Coredumps on Irix 6.5.15 with openafs-1.2.7

Martin MOKREJŠ mmokrejs@natur.cuni.cz
Tue, 10 Dec 2002 17:35:57 +0100 (CET)


Hi,
  so I've upgraded from IBM AFS 2.32 to the official openafs-1.2.7 binaries
and have a couple of problems:

1. bosserver and other services don't start, and the kernel reports:

Dec 10 17:11:41 1A:xx unix: |$(0x6da)ALERT: Process [vlserver] pid 1170 killed: process or stack limit exceeded
Dec 10 17:11:47 1A:xx unix: |$(0x6da)ALERT: Process [ptserver] pid 1173 killed: process or stack limit exceeded

What makes them so memory hungry? I did not change any kernel settings.
Could someone look into the changes between these versions for what
might be the cause of this?
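
The kernel message mentions a process or stack limit, so it may help to see which limits a process started from the current shell would inherit. What follows is only a minimal sketch, assuming a Python interpreter is available on the machine (the setup described here does not say so). `describe_limit` is a hypothetical helper name, and the limits seen from an interactive shell are not necessarily the ones bosserver passes on to its children.

```python
import resource


def describe_limit(name, which):
    """Return a one-line summary of the soft and hard limit for one resource.

    `name` is just a label for the output; `which` is one of the
    resource.RLIMIT_* constants.
    """
    soft, hard = resource.getrlimit(which)

    def fmt(value):
        # RLIM_INFINITY is how the resource module reports "no limit".
        return "unlimited" if value == resource.RLIM_INFINITY else f"{value} bytes"

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


# Example use (prints the values for the current process):
#   print(describe_limit("stack", resource.RLIMIT_STACK))
#   print(describe_limit("data", resource.RLIMIT_DATA))
```

Comparing the output before and after the upgrade, or against the size of the failing servers, would at least show whether the stack or the data limit is the tight one.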


2. coredumps in /usr/afs/logs/

-rw-r--r--    1 root     sys        929792 Dec 10 17:11 core
-rw-r--r--    1 root     sys       4344280 Apr 24  2002 corefile.fs
-rw-r--r--    1 root     sys        929792 Dec 10 16:57 coreptserver
-rw-r--r--    1 root     sys       1130496 Dec 10 16:57 corevlserver
-rw-r--r--    1 root     sys       1380352 May  3  2002 corevol.fs

All the corefiles turned out to be useless for retrieving a stack trace.
It seems the binaries are stripped or too heavily optimized. In one case,
I got:

# dbx /usr/afs/bin/fileserver corefile.fs
dbx version 7.3.1 68542_Oct26 MR Oct 26 2000 17:50:34
Elf 32 File Header in core file does not match executable/dso /usr/afs/bin/fileserver (elf header e_entry mismatch) (use of the core file may be misleading!)
Unable to correlate regions with rld, object list address is 0x0
Unable to correlate regions with rld object list:( dbx internal status code 7).Allowing some minimal use of the core file, butdbx will work poorly (if at all)
Core file does not correspond to executable
Executable /usr/afs/bin/fileserver
(dbx) where

process must be stopped
(dbx) quit
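
dbx says the core file does not correspond to the executable. The listing above dates corefile.fs to Apr 24 2002, well before this upgrade, so that core may have been written by the previous fileserver binary and not by the 1.2.7 one. A rough way to spot such leftovers is to compare modification times. This is a heuristic sketch only: `core_predates_executable` is a hypothetical helper, and it assumes the executable's modification time reflects when it was installed, which is not true if the install preserved the original timestamps.

```python
import os


def core_predates_executable(core_path, exe_path):
    """Return True if the core file was last modified before the executable.

    A core that is older than the binary it is being loaded against
    was most likely produced by a different (earlier) binary.
    """
    return os.path.getmtime(core_path) < os.path.getmtime(exe_path)


# Example use with the paths from this report:
#   core_predates_executable("/usr/afs/logs/corefile.fs",
#                            "/usr/afs/bin/fileserver")
```

If this returns True for a core file, a mismatch in dbx is expected, and only the cores written after the upgrade are worth inspecting.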


BosLog
Tue Dec 10 16:57:32 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:33 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:33 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:34 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:35 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:36 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:36 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:37 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:37 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:38 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:39 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:39 2002: BNODE 'vlserver' repeatedly failed to start, perhaps missing executable.
Tue Dec 10 16:57:39 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:39 2002: BNODE 'ptserver' repeatedly failed to start, perhaps missing executable.
Tue Dec 10 16:57:40 2002: vlserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:40 2002: BNODE 'vlserver' repeatedly failed to start, perhaps missing executable.
Tue Dec 10 16:57:40 2002: ptserver exited on signal 11 (core dumped)
Tue Dec 10 16:57:40 2002: BNODE 'ptserver' repeatedly failed to start, perhaps missing executable.
Tue Dec 10 16:59:37 2002: fs:salv exited with code 0
Tue Dec 10 17:01:20 2002: fs:file exited with code 1
Tue Dec 10 17:01:20 2002: fs:vol exited on signal 15
Tue Dec 10 17:03:17 2002: fs:salv exited with code 0
Tue Dec 10 17:05:04 2002: fs:file exited with code 1
Tue Dec 10 17:05:05 2002: fs:vol exited on signal 15
Tue Dec 10 17:06:51 2002: fs:salv exited with code 0
Tue Dec 10 17:08:03 2002: fs:file exited with code 1
Tue Dec 10 17:08:03 2002: fs:vol exited on signal 15
Tue Dec 10 17:09:54 2002: fs:salv exited with code 0
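
The BosLog above is repetitive, so a per-process tally of signal exits can make the pattern easier to read. The following is a small sketch that assumes the line format shown above ("<timestamp>: <name> exited on signal <n>"); `count_signal_exits` and `EXIT_RE` are hypothetical names, not part of OpenAFS.

```python
import re
from collections import Counter

# Matches e.g. "...2002: vlserver exited on signal 11 (core dumped)"
# and "...2002: fs:vol exited on signal 15".
EXIT_RE = re.compile(r": (\S+) exited on signal (\d+)")


def count_signal_exits(lines):
    """Count BosLog 'exited on signal' entries per (process name, signal)."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Example use:
#   with open("/usr/afs/logs/BosLog") as log:
#       for (name, sig), n in sorted(count_signal_exits(log).items()):
#           print(f"{name}: signal {sig} x {n}")
```

Lines that report an exit code instead of a signal, and the BNODE messages, are ignored by this pattern.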

FileLog
Tue Dec 10 17:09:55 2002 XFS/EFS File server starting
Tue Dec 10 17:11:06 2002 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=2)
Tue Dec 10 17:11:07 2002 Partition /vicepa: XFS inodes too small, exiting.
Tue Dec 10 17:11:07 2002 Run xfs_size_check utility and remake partitions.
Tue Dec 10 17:11:07 2002 Shutting down: errors encountered initializing volume package
Tue Dec 10 17:11:07 2002 VShutdown:  shutting down on-line volumes...
Tue Dec 10 17:11:07 2002 VShutdown:  complete.

PtLog
Tue Dec 10 17:11:41 2002 Using 195.113.59.111 as my primary address

SalvageLog
@(#) OpenAFS 1.2.7 built  2002-09-25
12/10/2002 17:11:41 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager)
12/10/2002 17:11:41 Partition /vicepa: XFS inodes too small, exiting.
12/10/2002 17:11:41 Run xfs_size_check utility and remake partitions.
12/10/2002 17:11:41 Starting salvage of file system partition /vicepb
12/10/2002 17:11:41 SALVAGING FILE SYSTEM PARTITION /vicepb (device=dks0d5s7)
12/10/2002 17:13:31 Unable to successfully stat inode file for (null)
12/10/2002 112/10/2002 17:14:05 SALVAGING OF PARTITION /vicepb COMPLETED
/salvage.inodes.dks0d5s7.1163; Not salvaged dks0d5s7 ***
Increase space on partition or use '-tmpdir'

A null filename during server startup?

But running it interactively, I got:
# /usr/afs/bin/salvager -nowrite -partition /vicepb -showlog -debug
@(#) OpenAFS 1.2.7 built  2002-09-25
12/10/2002 17:29:47 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager -nowrite -partition /vicepb -showlog -debug)
12/10/2002 17:29:47 Partition /vicepa: XFS inodes too small, exiting.
12/10/2002 17:29:47 Run xfs_size_check utility and remake partitions.
12/10/2002 17:29:47 SALVAGING FILE SYSTEM PARTITION /vicepb (device=dks0d5s7(READONLY mode))
12/10/2002 17:29:47 Removing old salvager temp files salvage.inodes.dks0d5s7.1273
12/10/2002 17:31:30 SALVAGING OF PARTITION /vicepb (READONLY mode) COMPLETED
#

Why does the salvager care about /vicepa when I asked it to look only
at /vicepb?


SalvageLog.old
@(#) OpenAFS 1.2.7 built  2002-09-25
12/10/2002 17:11:07 STARTING AFS SALVAGER 2.4 (/usr/afs/bin/salvager)
12/10/2002 17:11:07 Partition /vicepa: XFS inodes too small, exiting.
12/10/2002 17:11:07 Run xfs_size_check utility and remake partitions.
12/10/2002 17:11:07 Starting salvage of file system partition /vicepb

VLLog
Tue Dec 10 17:11:41 2002 Using 195.113.59.111 as my primary address

VolserLog
Tue Dec 10 17:09:55 2002 Starting AFS Volserver 2.0 (/usr/afs/bin/volserver)



Would anyone be interested if I built -g2 binaries and posted the resolved
stack traces to the list?

-- 
Martin Mokrejs <mmokrejs@natur.cuni.cz>, <m.mokrejs@gsf.de>
PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany
tel.: +49-89-3187 3683 , fax: +49-89-3187 3585