[OpenAFS-devel] issues with crashing fileserver...

Neulinger, Nathan nneul@umr.edu
Thu, 18 Apr 2002 09:39:59 -0500


Yep, that's definitely true.

I want to fix the file server problem causing the crash (which the lwp
fs will help with), but I actually consider it more important to fix the
problem where the fileserver doesn't terminate completely after a crash.
Recovery first, stability second.

That may sound a bit backwards, but you're always going to have crashes
no matter what software you use. Seems to me to be more important to be
able to recover quickly and completely. If you can do that, fixing the
actual crash cause isn't as life-or-death.
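
To make "recover completely" concrete, here is a minimal sketch of one way a
supervisor could guarantee that nothing survives a crashed server. It is not
how bosserver works; run_and_reap and its grace parameter are hypothetical
names made up for illustration. The assumption is a POSIX system where the
server is started in its own process group, so any leftover child processes
(or thread-processes) can be signalled as a group once the leader has died.

```python
import os
import signal
import subprocess
import time


def run_and_reap(argv, grace=2.0):
    """Run argv in its own process group and clean up after it exits.

    Hypothetical helper, POSIX only. Returns the leader's exit status
    (negative signal number if it was killed by a signal).
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process group ID equals its PID.
    proc = subprocess.Popen(argv, start_new_session=True)
    pgid = proc.pid

    status = proc.wait()  # the leader has exited or crashed

    # Anything still in the group is an orphan. Ask politely first,
    # then force the issue.
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(pgid, sig)
        except ProcessLookupError:
            break  # group is already empty, nothing left to clean up
        time.sleep(grace)

    return status
```

Only after run_and_reap returns would the supervisor start a fresh instance,
so a restart never races against leftovers from the previous one. A real
implementation would also need to worry about PID reuse and about processes
that deliberately leave the group.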

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       EMail:  nneul@umr.edu
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216


> -----Original Message-----
> From: Matthew Andrews [mailto:mnandrews@lbl.gov]
> Sent: Thursday, April 18, 2002 9:33 AM
> To: Neulinger, Nathan
> Subject: Re: [OpenAFS-devel] issues with crashing fileserver...
>
>
> Correct me if I'm wrong, but if what you're trying to debug is the BOS
> behavior, or in general the fact that only part of the fileserver dies,
> none of this behavior will occur in an lwp server, given that there is
> only one process (kernel thread) and therefore no orphaned processes are
> possible.
>
> This may be able to help with debugging the crash itself, but I
> don't believe that it would be of any use for debugging the fact that
> child processes/pthreads are never reaped.
>
> -Matthew Andrews
>
> Nathan Neulinger wrote:
>
> >Derrick J Brashear wrote:
> >
> >>On Wed, 17 Apr 2002, Neulinger, Nathan wrote:
> >>
> >>>Notice - file dies once with a SEGV, then it's unable to get the server
> >>>sane again.
> >>>
> >>>Unfortunately, this is on linux, and there are no core dumps since it's
> >>>a threaded process.
> >>>
> >>install the lwp fileserver and get a core?
> >>
> >
> >Might just try that. I guess the build will be similar enough other than
> >the threading support...
> >
> >Is there much performance/etc. difference between the two?
> >
> >-- Nathan
> >
> >------------------------------------------------------------
> >Nathan Neulinger                       EMail:  nneul@umr.edu
> >University of Missouri - Rolla         Phone: (573) 341-4841
> >Computing Services                       Fax: (573) 341-4216
> >_______________________________________________
> >OpenAFS-devel mailing list
> >OpenAFS-devel@openafs.org
> >https://lists.openafs.org/mailman/listinfo/openafs-devel
> >