[OpenAFS] AFS namei file servers, SAN, any issues elsewhere? We've had some. Can AFS _cause_ SAN issues?

Christof Hanke hanke@csc.fi
Mon, 17 Mar 2008 10:33:45 +0200


Kim Kimball wrote:
> SAN symptoms, for those interested
> 
> I'm seeing SCSI command timeouts and UFS log timeouts (on vice 
> partitions using the SAN for storage) on LUNS used for vicep's on the 
> Hitachi USP, and was seeing them also on the 9585 until a recent 
> configuration change.
> 
> At first I thought this was load related, so wrote scripts to generate a 
> goodly load.  It turns out that even with a one second sleep between 
> file create/write/close operations and between rm operations the SCSI 
> command timeouts still occur, and that it's not load but simply activity 
> that turns up the timeouts.
We recently purchased a Hitachi USP to virtualize LUNs from AMS systems behind the USP
(actually for dCache as an SE in the LHC project).
The result was a lot of SCSI errors on the client side (Linux and Solaris).
Hitachi sent some guys to CSC to fix that issue.
I wasn't involved in the fixing part, but IIRC the problem was connected to the
caching of the USP, and the "fix" involved shortening the SCSI command queue of the clients
(in your case the AFS fileserver).
I guess if you run something like iozone on one or more of the servers you'll find
the same SCSI-errors. I really doubt it is connected to AFS.
Contact me off-line if you need more detailed info, or I can put you in touch with the SAN guys
here.
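
For anyone who wants to check this without AFS in the picture, here is a minimal
sketch of a low-rate activity generator along the lines Kim describes (file
create/write/close, then rm, with a sleep in between). It is not the script used
at either site; the function name, file names, and default values are made up for
illustration, and the target directory is assumed to be a scratch directory on a
filesystem backed by the affected LUN.

```python
import os
import time


def generate_activity(directory, count=10, size=4096, pause=1.0):
    """Create, write, fsync, close and remove `count` small files.

    Sleeps `pause` seconds after each write and after each remove, so the
    resulting I/O is light activity rather than load.
    Returns the number of files that completed the full cycle.
    """
    payload = b"x" * size
    done = 0
    for i in range(count):
        # Hypothetical file name, chosen only for this sketch.
        path = os.path.join(directory, "probe_%d.dat" % i)
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            # Push the data to the device instead of leaving it in the page cache.
            os.fsync(f.fileno())
        time.sleep(pause)
        os.remove(path)
        time.sleep(pause)
        done += 1
    return done
```

Calling generate_activity("/path/to/scratch", count=100) while watching the system
log should show whether the SCSI timeouts also appear with plain filesystem
activity; the path here is a placeholder.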


T/Christof