[OpenAFS] openafs performance in an internet-scale traffic environment

Richard Human openafs@psychopuppy.net
Wed, 21 Feb 2007 20:01:36 +0200


Hi all,

I'm currently investigating several options for a back-end storage  
system for an internet application. One of the solutions I'm  
investigating is OpenAFS, and I'm wondering whether OpenAFS is suited  
to this type of application.

Some background: we have 4 front-end http servers (apache) serving up  
small (in the region of 50K) files. We have in the order of 1.5M to  
2M such files, and are currently doing around 4k - 5k requests per  
second. The number of writes per second is an order of magnitude  
lower. The servers have attached RAID for the back-end storage  
(commodity h/w). We partition our files (based on filename) and split  
them across the servers. Obviously this is not redundant.
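
To be a bit more concrete, the partitioning we do today is roughly  
along these lines (a simplified sketch in Python; the server names and  
the choice of hash are just illustrative, not our exact scheme):

    import hashlib

    # Hypothetical list of back-end storage servers (commodity RAID boxes).
    SERVERS = ["store1", "store2", "store3", "store4"]

    def server_for(filename):
        # Hash the filename and map it onto one of the servers.
        # If a server goes down, the files that hash to it are simply
        # unavailable, which is the redundancy gap mentioned above.
        digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
        return SERVERS[int(digest, 16) % len(SERVERS)]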

One solution would be to use NAS devices (like NetApp), but as soon  
as you start clustering these the price goes up dramatically. We  
would prefer to be able to add commodity servers to our network and  
not have to reconfigure NFS mount-points etc.

So we're looking at OpenAFS. We are thinking of a setup where we have  
a cell of 2+ file servers on the back-end, with our data spread over a  
number of volumes that we can move around between servers. This takes  
care of the scalability issue. For redundancy we would investigate the  
volume replication feature.
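
(For the volume moves and replication I'm assuming the standard vos  
commands would be the mechanism; the server, partition and volume names  
below are made up, just to show the kind of operation I mean:)

    # Rebalance by moving a read/write volume between file servers:
    vos move web.files.017 fs1.example.com /vicepa fs2.example.com /vicepb

    # Add a read-only replica site on a second server, then push the
    # current contents of the read/write volume out to it:
    vos addsite fs2.example.com /vicepb web.files.017
    vos release web.files.017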

I've read the success stories on the website. However, most of these  
cases describe scenarios where there are hundreds of clients accessing  
an AFS cell that holds users' home directories. In our scenario we  
will have at most 5 to 10 clients, but they will be performing 5k - 10k  
operations per second across the whole cell.

My question is simple: has OpenAFS been used (or can it be used) in a  
situation like this? If so, I will go ahead with building a test  
platform and testing it in our environment. But before doing that I'd  
like to know if I'm on the right track.

Many thanks
Richard Human