[OpenAFS] Problems in the last 2 days

Klaas Hagemann kerberos@northsailor.de
Wed, 29 Jan 2003 21:27:44 +0100


Derrick J Brashear wrote:
> On Wed, 29 Jan 2003, Klaas Hagemann wrote:
> 
> Leaving the important point:
> 
> 
>>>It's going to be the same problem, malloc'ing yourself to death. Set a
>>>resource limit before starting the fileserver, and get a core from the LWP
>>>fileserver
> 
> 
> 
>>>>Another error was that the volserver stopped working while the fileserver 
>>>>was still running. So "vos examine >volume<" returned a failure, but 
>>>>the volume was still reachable.
> 
> 
>>>Did you get a core?
>>
>>No, from the last crash I only got a core for upclientetc. But I could 
>>not find anything really interesting in it.
> 
> 
> OK, this is the volserver that we'd want a core from, here, but you resume
> talking about the fileserver below
> 
> 
>>I compiled the fileserver without the -O2 and with -g option and the 
>>ulimit for core was set to unlimited.
> 
> 
> Well, yes, but if, as you say, the machine crashed, there is no core. You
> have to limit the process memory use to less than "all the memory on the
> machine" so you get a core ;-)

OK, that's right.
But I watched "top" while I was getting the malloc errors, and neither the 
fileserver nor the volserver was bigger than 12 MB.
Btw: I had a closer look at the core, and it was produced by the bosserver.
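
For reference, a minimal sketch of the kind of per-process limit suggested
above, written in Python. The helper names (limit_child, run_limited) and the
256 MB figure in the usage note are illustrative assumptions, not anything
from OpenAFS; the idea is only that a capped address space makes a runaway
malloc fail inside the server instead of taking the machine down, and that
the core size limit is raised so a dump can be written.

```python
import resource
import subprocess
import sys


def limit_child(max_bytes):
    """Return a function that applies the limits inside the child process."""
    def apply_limits():
        # Cap the address space (soft limit only) so allocations start
        # failing in this process before the whole machine runs out.
        _, hard_as = resource.getrlimit(resource.RLIMIT_AS)
        soft_as = max_bytes
        if hard_as != resource.RLIM_INFINITY:
            soft_as = min(max_bytes, hard_as)
        resource.setrlimit(resource.RLIMIT_AS, (soft_as, hard_as))
        # Raise the soft core-size limit as far as the hard limit allows.
        _, hard_core = resource.getrlimit(resource.RLIMIT_CORE)
        resource.setrlimit(resource.RLIMIT_CORE, (hard_core, hard_core))
    return apply_limits


def run_limited(argv, max_bytes, **kwargs):
    """Run argv with the limits above applied just before exec."""
    return subprocess.run(argv, preexec_fn=limit_child(max_bytes), **kwargs)
```

Hypothetical usage would be run_limited(["/path/to/server"], 256 * 1024 * 1024),
where the path is a placeholder. Since child processes inherit limits, setting
the equivalent limits with "ulimit" in the shell that starts the bosserver
should have a similar effect on the servers it launches.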

Just as I was writing this, I got core files from the fileserver and 
the volserver. There were several messages in syslog (kernel: __alloc pages), 
and the volserver and fileserver were killed by signal 6.
But they were successfully restarted a few minutes later, the salvager 
repaired the filesystem, and the system is back up and running.
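
As a side note on "signal 6": on Linux that number is SIGABRT, the signal a
process receives when it calls abort(). A small sketch for decoding such
numbers follows; describe_exit is a hypothetical helper name, and it relies
on Python's subprocess convention of reporting death by signal as a negative
return code.

```python
import signal


def describe_exit(returncode):
    """Name the signal when a child was killed by one, else show the status."""
    if returncode < 0:
        # subprocess reports "killed by signal N" as -N.
        return signal.Signals(-returncode).name
    return f"exit status {returncode}"


# -6 corresponds to the "killed by signal 6" case described above.
print(describe_exit(-6))
```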

I will have a closer look at the core dumps. Might the newest version 
(1.2.7) help me out? I am still using 1.2.6.

