[OpenAFS] Crash testing OpenAFS
ted creedon
tcreedon@easystreet.com
Mon, 15 Aug 2005 10:36:34 -0700
General comment:
At the Workshop the MIT presenter stated that AFS worked fine until it was
"placed under load".
In the process of copying files from a 1.2.11 box to a 1.3.87 box I ran into
the same problem with Linux-to-Linux copying. That's how all this started.
tedc
-----Original Message-----
From: chas williams - CONTRACTOR [mailto:chas@cmf.nrl.navy.mil]
Sent: Monday, August 15, 2005 10:24 AM
To: ted creedon
Cc: 'Jeffrey Altman'; openafs-info@openafs.org
Subject: Re: [OpenAFS] Crash testing OpenAFS
In message <20050815171201.52EC0C62A@smtpauth.easystreet.com>, "ted creedon"
writes:
>3. Copying the test set with empty files works fine. Files with data
>crashes the destination 1.3.87 Linux box.
by "destination" you are referring to the afs fileserver containing the
destination afs volume?
>>Yes.
>5. Crash means the Linux operating system crashes. Other xterm windows
>do not respond, the system won't soft reboot and usually won't respond to
>ping. Hardware reset is required.
if my assertion is true, then you should not be running anything on the
console, this includes x-windows or any other pretty graphical gui. if the
server crashes, then you will be able to see the panic/oops unless the
machine wedges in which case we have to try something else.
try the simple things first though.
>>Can do. I'll keep at runlevel 5 but kill X.
you keep saying your cache gets corrupted. this leads me to think that your
afs client machine is crashing and not the afs file server.
>>I suspect the Linux AFS 1.3.87 client crashes on the 1.3.87 server but I
don't have the expertise to tell. The cache corruption may be caused by the
hard reset?
are the fileserver and client the same machine?
>>There are 2 client/fileservers: a 1.2.11 and a 1.3.87. The 1.3.87 box is
the destination that crashes.
how about that network diagram?
>>You should have it by now.