[OpenAFS] FS exited on signal 6
Jeffrey Hutzelman
jhutz@cmu.edu
Mon, 11 Oct 2004 10:42:55 -0400
On Sunday, October 10, 2004 23:11:51 -0400 Derrick J Brashear
<shadow@dementia.org> wrote:
> On Mon, 11 Oct 2004, Matthew Cocker wrote:
>> And why is a signal 6 entry in boslog followed by automatic salvage and
>> restart of the FS
Signal 6 is SIGABRT. A process that receives this signal dies, unless it
has made special provisions to do otherwise (which would be pointless in
this case, since the reason for the SIGABRT is that some internal
inconsistency was detected). Under normal circumstances, such a process
will also drop a core file; however, a resource limit may be in effect
which prevents that.
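
To see both effects for yourself, here is a minimal sketch in Python (my choice of language, nothing to do with the fileserver's own code). It starts a throwaway child that calls abort(), then reports the signal that killed it. The helper name run_aborting_child is made up for illustration, and the child's core limit is deliberately set to 0 to show how a resource limit suppresses the core file.

```python
import resource
import signal
import subprocess
import sys


def _disable_core_dumps():
    # Runs in the child just before exec: lower the soft core-size limit to 0
    # so the kernel will not write a core file when the child aborts.
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))


def run_aborting_child():
    """Run a child process that calls abort(); return its exit status."""
    child = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        preexec_fn=_disable_core_dumps,
        stderr=subprocess.DEVNULL,
    )
    return child.returncode


status = run_aborting_child()
# On POSIX, subprocess reports death by signal N as a return code of -N.
killed_by = signal.Signals(-status) if status < 0 else None
soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_CORE)

print("child terminated by:", killed_by.name if killed_by else "normal exit")
print("core files enabled in this shell:", soft_limit != 0)
```

If the second line prints False in the environment that starts the bosserver, that would explain a missing core file after the signal 6.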
The automatic salvage happens because the bosserver always runs the salvager
after the fileserver terminates abnormally, since things may be left in an
inconsistent state. It is
equivalent to automatically running fsck on a filesystem that was not
unmounted cleanly.
The automatic restart happens because that's what the bosserver is for.
>> , nearly always followed within 24 hours with another
>> total lock up where as manually started salvages seem to keep things
>> happier for longer?
>
> luck?
Yes, this is likely entirely luck, or else the correlation is not as strong
as is suggested. It is pretty likely that running the salvager is not
relevant, and it's actually the fileserver restart that has an effect
(remember, a whole-partition or whole-server salvage requires shutting down
the fileserver for the duration).
-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
Sr. Research Systems Programmer
School of Computer Science - Research Computing Facility
Carnegie Mellon University - Pittsburgh, PA