[OpenAFS-devel] Coredumps on Irix 6.5.15 with openafs-1.2.7

Martin MOKREJŠ mmokrejs@natur.cuni.cz
Wed, 11 Dec 2002 14:08:32 +0100 (CET)


On Tue, 10 Dec 2002, Derrick J Brashear wrote:

> On Tue, 10 Dec 2002, Martin MOKREJŠ wrote:
>
> > Hi,
> >   so I've upgraded from IBM AFS 2.32 to openafs-1.2.7 official binaries
> > and have a couple of problems:
> >
> > 1. bosserver and other services don't start, and the kernel reports:
> >
> > Dec 10 17:11:41 1A:xx unix: |$(0x6da)ALERT: Process [vlserver] pid 1170 killed: process or stack limit exceeded
> > Dec 10 17:11:47 1A:xx unix: |$(0x6da)ALERT: Process [ptserver] pid 1173 killed: process or stack limit exceeded
> >
> > What makes them so memory hungry? I did not change the kernel settings at all.
> > Would someone have a look into the changes between these versions for what
> > might be the cause of this?
>
> The default "limit"s are probably too small. Did you run it from a shell
> or from an rc script? The latter probably is unlimited while the former is

No, from rc script.

> probably limited. What does running:
> limit
> offer?

I've added a few lines to the startup script that raise the limits and
print them before and after the change:

Starting AFS. Current limits are:
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         4000000
stack(kbytes)        500000
memory(kbytes)       150768
coredump(blocks)     unlimited
nofiles(descriptors) 200
vmemory(kbytes)      1048576
concurrency(threads) 1024

New limits are:
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited
vmemory(kbytes)      unlimited
concurrency(threads) 1024
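
In case someone wants to run the same kind of check on another machine,
here is a rough sketch of the before/after report. It is written in Python
with the standard resource module, not taken from the actual startup script
lines (which are not shown here). The labels and the function names
(current_limits, raise_soft_to_hard, report) are made up for illustration,
and only a few of the limits listed above are covered. It raises each soft
limit to its hard limit, which is as far as a process can go without extra
privileges.

```python
import resource

# Limits to report. The labels on the left are only names for this sketch.
LIMITS = {
    "data": resource.RLIMIT_DATA,
    "stack": resource.RLIMIT_STACK,
    "coredump": resource.RLIMIT_CORE,
    "nofiles": resource.RLIMIT_NOFILE,
}


def fmt(value):
    """Render a limit value the way ulimit does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the limits listed in LIMITS."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


def raise_soft_to_hard():
    """Try to raise every soft limit to its hard limit."""
    for res in LIMITS.values():
        _soft, hard = resource.getrlimit(res)
        try:
            resource.setrlimit(res, (hard, hard))
        except (ValueError, OSError):
            # Some systems refuse certain values; keep the old limit then.
            pass


def report(title):
    """Print a title followed by one line per limit."""
    print(title)
    for name, (soft, hard) in current_limits().items():
        print(f"{name:10s} soft={fmt(soft)} hard={fmt(hard)}")


if __name__ == "__main__":
    report("Starting AFS. Current limits are:")
    raise_soft_to_hard()
    report("New limits are:")
```

Limits set this way are inherited by child processes, so a wrapper like
this would have to run before the servers are started.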

I've increased the kernel settings by appending the following to /var/sysgen/stune:

rlimit_stack_cur        = 512000000 ll
rlimit_data_cur = 4096000000 ll
rlimit_pthread_max      = 2048 ll
rlimit_stack_max        = 1024000000 ll
rlimit_data_max = 4096000000 ll
nproc   = 4096

With the changes above I have:

Dec 11 14:05:49 1A:xx unix: ALERT: ptserver [2983] - out of logical swap space during stack growth - see swap(1M)
Dec 11 14:05:49 1A:xx unix: ALERT: ptserver [2983] - out of logical swap space during stack growth - see swap(1M)
Dec 11 14:05:49 1A:xx unix: |$(0x6dc)ALERT: Process [ptserver] pid 2983 killed: not enough memory to grow stack
Dec 11 14:05:49 1A:xx unix: ALERT: vlserver [2998] - out of logical swap space during stack growth - see swap(1M)
Dec 11 14:05:49 1A:xx unix: ALERT: vlserver [2998] - out of logical swap space during stack growth - see swap(1M)


The machine has 196MB RAM, and swap looks like this:
# swap -s
total: 0.00k allocated + 74.03m add'l reserved = 74.03m bytes used, 181.48m bytes available
#

Of course I can add swap, but the question is why OpenAFS demands so
much. IBM AFS did NOT!
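
To put the numbers side by side, here is a small arithmetic sketch. It
assumes the stune values are in bytes (which matches the "Current limits"
output: 512000000 / 1024 = 500000 kbytes of stack, 4096000000 / 1024 =
4000000 kbytes of data) and that the "m" in the swap -s line means MiB.
It only compares the configured ceilings with the available logical swap;
whether the servers really try to grow that far is a separate question.

```python
KIB = 1024
MIB = 1024 * 1024

# Values copied from the stune entries and the `swap -s` line in this mail.
# Assumption: stune values are bytes, and "181.48m" means MiB.
STUNE_BYTES = {
    "rlimit_stack_cur": 512_000_000,
    "rlimit_stack_max": 1_024_000_000,
    "rlimit_data_cur": 4_096_000_000,
}
SWAP_AVAILABLE_MIB = 181.48


def to_kib(nbytes):
    """Bytes to whole kbytes, the unit the limit listing uses."""
    return nbytes // KIB


def to_mib(nbytes):
    """Bytes to MiB, the unit assumed for the swap -s figures."""
    return nbytes / MIB


def compare(limits, swap_mib):
    """Return one (name, kbytes, MiB, exceeds_swap) tuple per limit."""
    return [
        (name, to_kib(value), to_mib(value), to_mib(value) > swap_mib)
        for name, value in limits.items()
    ]


if __name__ == "__main__":
    for name, kib, mib, exceeds in compare(STUNE_BYTES, SWAP_AVAILABLE_MIB):
        verdict = "exceeds" if exceeds else "fits within"
        print(f"{name}: {kib} kbytes = {mib:.1f} MiB, {verdict} available swap")
```

Under those assumptions every one of these ceilings is larger than the
181.48m of swap currently available.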


# ps -lfe
  F S      UID        PID       PPID  C PRI NI  P    SZ:RSS      WCHAN    STIME TTY     TIME CMD
  4 S     root        347          1  0 199 RT  *     0:0     8ed97160 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        348          1  0  68 RT  *     0:0     8e84f554 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        349          1  0 199 RT  *     0:0     8ed97810 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        350          1  0  68 RT  *     0:0     8ed97810 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        351          1  0  68 RT  *     0:0     8ed97810 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        352          1  0  68 RT  *     0:0     8ed97d10 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        353          1  0  68 RT  *     0:0     8ed97d10 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        354          1  0  68 RT  *     0:0     8ed97d10 12:54:27 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        358          1  0  68 RT  *     0:0     8ed97e10 12:54:35 ?       0:00 /usr/vice/etc/afsd -nosettime -stat
  4 S     root        639          1  0  20 20  *  2436:1351  8cbfb560 12:56:47 ?       0:03 /usr/afs/bin/fileserver
  4 S     root        640          1  0  20 20  *   857:555   8fedd0a0 12:56:47 ?       0:00 /usr/afs/bin/volserver

I guess the fileserver is the problem. But the real problem is that not even
bosserver is running; or rather, I get:

# bos status -server xx -long
bos help stop
bos: failed to contact host's bosserver (communications failure (-1)).
#

> > Tue Dec 10 17:09:55 2002 XFS/EFS File server starting
> > Tue Dec 10 17:11:06 2002 VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=2)
> > Tue Dec 10 17:11:07 2002 Partition /vicepa: XFS inodes too small, exiting.
> > Tue Dec 10 17:11:07 2002 Run xfs_size_check utility and remake partitions.
>
> That's a new one on me. I don't know enough about XFS offhand to comment.
>
> > PtLog
> > Tue Dec 10 17:11:41 2002 Using 195.113.59.111 as my primary address
>
> That's good.
>
> > Why does /salvager care about /vicepa, when I asked it to look only
> > at /vicepb?
>
> When fileserver died, it tried to salvage. unmount /vicepa if you don't
> want it considered.

-- 
Martin Mokrejs <mmokrejs@natur.cuni.cz>, <m.mokrejs@gsf.de>
PGP5.0i key is at http://www.natur.cuni.cz/~mmokrejs
MIPS / Institute for Bioinformatics <http://mips.gsf.de>
GSF - National Research Center for Environment and Health
Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany
tel.: +49-89-3187 3683 , fax: +49-89-3187 3585