[OpenAFS-devel] Solaris10x86 & OpenAFS 1.3.81 thru 1.3.84

Robert Banz banz@umbc.edu
Thu, 30 Jun 2005 14:31:02 -0400


Christopher D. Clausen wrote:
> Robert Banz <banz@umbc.edu> wrote:
> 
>>It's a client-only, it's heavily utilized as a POP/IMAP server.  (we'd
>>like to move all of our machines in this "cluster" to sx86 from linux,
>>because, with the exception of this unexplained problem, they perform
>>so much better...)
>>
>>-stat 8000 -daemons 6 -volumes 4096 -files 50000 -nosettime
>>
>>It's using a ufs-based cache.  I've tried it both in a logging and
>>nologging mode, and still get the same problem.
> 
> 
> How big of a disk cache?
> 
> Can you try -memcache to eliminate the disk cache as the problem?  (Of
> course, memcache might cause different problems...)

Cache size is approximately 1 GB.

I tried a memcache earlier in this process, and it tended to crash the
machine sooner.  (Sadly, 32-bit Solaris limits the maximum kernel memory
allocation to a pretty mediocre size.)

> 
> Also, what POP/IMAP server software are you running?  Are you keeping
> mail stored in AFS?  I'd like to do the same and I can get a machine up
> with a similar config and see if I can replicate this problem.

We're running UW-IMAPD, with a variant of maildir in AFS for the backend 
storage.  'fs setcrypt' is off on this box.

It's a Dell 2650; the 'indication' that kernel memory is exhausted
comes when the Adaptec (cadp160) SCSI driver fails to allocate memory
for DMA (or a few other variants on that same theme).  Originally, the
problem was thought to be in the SCSI driver, but word from Sun
suggests that it's not.

> 
> And, I assume that knowing what the hardware is will help further
> isolate the problem.

What I said :)

Thanks to everybody who's taking a peek at this!  I really hope I'm not
sending anyone on a wild goose chase.

-rob