[OpenAFS] Crash testing OpenAFS

ted creedon tcreedon@easystreet.com
Mon, 15 Aug 2005 09:26:21 -0700


>It should be noted that this directory structure crashed the 1.3.84 Windows
client but does not crash the 1.3.87 Debug client.

tedc

-----Original Message-----
From: chas williams - CONTRACTOR [mailto:chas@cmf.nrl.navy.mil] 
Sent: Monday, August 15, 2005 8:55 AM
To: ted creedon
Cc: openafs-info@openafs.org
Subject: Re: [OpenAFS] Crash testing OpenAFS 

In message <20050815150504.4AAF3BF44@smtpauth.easystreet.com>,
"ted creedon" writes:
>ftp://creedon.dhs.org/afs_stress_test/run0/
>ftp://creedon.dhs.org/afs_stress_test/run1

i recreated your test directory tree locally.  i am puzzled about a few
things though.  for instance:

	#!/bin/bash
	#set -x #if one is curious..
	dd if=/dev/zero of=1meg bs=256K count=1
	cp 1meg "./TESTDIR.TMP"
	cp 1meg "./ADAPTEC/ACMWrapperServer.A021.dll"
	cp 1meg "./ADAPTEC/ACMWrapperServer.A884.dll"
	cp 1meg "./ADAPTEC/CdCopier.A021.exe"

i would hazard that this is creating 256k files, not 1M files.

>You are correct. Filesize was reduced, script name was not changed.
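
a quick way to confirm the size is sketched below. it assumes GNU dd, where
the K suffix means 1024 bytes. the temporary file comes from mktemp and is
not part of the original test scripts.

```shell
#!/bin/bash
# Reproduce the dd invocation from the test script against a scratch file.
# With GNU dd, bs=256K count=1 writes one block of 256 * 1024 bytes.
tmpfile=$(mktemp)
dd if=/dev/zero of="$tmpfile" bs=256K count=1 2>/dev/null

# Count the bytes actually written.
size=$(wc -c < "$tmpfile")
echo "bytes written: $size"

# For comparison, a true 1 MiB file would be 1024 * 1024 bytes.
one_mib=$((1024 * 1024))
echo "1 MiB would be: $one_mib"

rm -f "$tmpfile"
```

a real 1 MiB file would need something like bs=256K count=4 or bs=1M count=1.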

the total volume size, after running ./mkdirs, ./mkfiles, ./mk1megfiles was
about 5.6G.  is this correct?

i was able to copy this tree from one volume to another on a different
server (within our local afs cell).  the servers are amd64_solaris10 running
openafs 1.3.81.

>Not a valid comparison. These are SUSE 9.3 Linux boxes. Intercell copying
is the point of the test.

the afs client machine which did the create and subsequent copy, was
i386_2.6.13-rc3 running openafs 1.3.87.
>Running client or client and server for a different cell?

>The problem seems to be when copying from a remote AFS volume on cell A to
a local AFS volume on cell B.

your tcpdump leads me to believe that at least part of these tests is behind
a NAT.  is this true? 
>No NAT. iptables -L shows no firewall.

further, the tcpdump from run1 looks incomplete.  the end of the dump still
seems to show data transfer.


the fstrace output from run0 is useless.  you need to install the afszcm.cat
in order to get something human readable.

cmdebug from run0 looks unremarkable.  the client does not appear to be
wedged in any way.

conclusions:  i would guess that the 1.3.87 openafs client is stable.
>But the combination of a client and server is not. That's the point.

perhaps you could try building and running an older set of afs server
binaries, say 1.3.81.
>No way. This just confuses the issue.