[OpenAFS] Crash testing OpenAFS

ted creedon tcreedon@easystreet.com
Fri, 12 Aug 2005 20:42:06 -0700


I can make 1.3.87 crash at will.

Tried to cp two 1.2.11 volumes to the new file server and it hung after a
long while.

No interesting info in cmdebug or fstrace.

I'll send info tomorrow.

Need some prompting on tcpdump. 

Hardcore.

tedc

-----Original Message-----
From: chas williams - CONTRACTOR [mailto:chas@cmf.nrl.navy.mil] 
Sent: Friday, August 12, 2005 1:28 PM
To: ted creedon
Cc: openafs-info@openafs.org
Subject: Re: [OpenAFS] Crash testing OpenAFS 

>cp -rvp /afs/.bigcell/foo/* /afs/.home.ted-doris.fam/bar #doesn't work
>
>Where bigcell is 1.2.11 and home is 1.3.87

both servers running linux?  which kernel?  client is running linux?
which kernel?  which version of afs on the client?

>1. The transfer stops for no apparent reason after transferring for 
>quite a while. Looking at top, rxlistener etc. just disappear. The OS 
>and afs server/clients remained up.

after it stops, on the afs client, run: 

	cmdebug localhost 7001 -long

also, look in /usr/afs/logs on both servers.  in particular, FileLog and
VolserLog. 
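
A minimal sketch of pulling the tail of both logs in one step. The function name `show_afs_logs`, the `AFS_LOG_DIR` override and the 50-line window are illustrative assumptions; only the directory and the two log names come from the message above.

```shell
# Hypothetical helper: print the last lines of FileLog and VolserLog.
# AFS_LOG_DIR is an assumed override; it defaults to the path given above.
show_afs_logs() {
    log_dir="${AFS_LOG_DIR:-/usr/afs/logs}"
    for name in FileLog VolserLog; do
        if [ -r "$log_dir/$name" ]; then
            echo "==== $name ===="
            # 50 lines is an arbitrary window; widen it if the hang is older.
            tail -n 50 "$log_dir/$name"
        else
            echo "==== $name: not readable in $log_dir ===="
        fi
    done
}
```

Running `show_afs_logs` on each server right after the copy stalls should keep the relevant entries near the end of the output.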

you might also try running 'fstrace setset' (as root) after the cp stops
running properly, wait a minute or two, and then issue 'fstrace dump' and
see if anything is in there.

failing any of that, we will likely need a tcpdump capture from the
client.  check the archives for proper options.
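
As a starting point for the capture, here is a hedged sketch. These are not the options from the list archives; the function name `afs_capture`, the interface `eth0`, the output path and the choice of ports (7000 fileserver, 7001 client callbacks, 7003 vlserver, all UDP) are assumptions to adjust for the actual setup.

```shell
# Hypothetical capture helper; run as root on the AFS client.
# $1 = network interface (assumed eth0), $2 = capture file (assumed path).
afs_capture() {
    iface="${1:-eth0}"
    out_file="${2:-/tmp/afs-client.pcap}"
    # -s 0 keeps whole packets; -w writes raw packets for later analysis.
    # The filter limits the capture to the assumed AFS UDP ports.
    tcpdump -i "$iface" -s 0 -w "$out_file" \
        'udp and (port 7000 or port 7001 or port 7003)'
}
```

Start `afs_capture eth0 /tmp/afs-client.pcap` before the cp, let it run until the transfer hangs, then stop it with Ctrl-C.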

>3. It appears that the crashing is related to actual stress under load i.e.
>when the 24000 files contain actual data.

about how much data?
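
One way to get a rough answer is to count the files and total the disk usage of the source tree. The helper name `tree_summary` is hypothetical, and walking a large tree in /afs can take a while.

```shell
# Hypothetical helper: report file count and size in KB for a directory tree.
tree_summary() {
    dir="$1"
    # Count regular files only; tr strips padding that some wc versions add.
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    # du -sk prints "<kilobytes><TAB><path>"; keep the first field.
    kbytes=$(du -sk "$dir" | cut -f1)
    echo "files=$files kbytes=$kbytes"
}
```

For example, `tree_summary /afs/.bigcell/foo` prints a single line with both numbers.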

>4. cp -rvp /afs/.bigcell/foo/* /afs/.bigcell/bar #does seem to work.

how about:

cp -rvp /afs/.home.ted-doris.fam/foo/* /afs/.home.ted-doris.fam/bar