[OpenAFS] OpenAFS with RAID
Paul Robins
paul@paulrobins.alldaypa.com
Wed, 28 Dec 2005 15:21:31 +0000
Jeffrey,
I appreciate your lengthy reply; you've confirmed many of the things
I was wondering about. The big issue with the server situation is that
a dying disk will in fact kill the entire server, as these are
low-budget whiteboxes with basic SATA controllers, nothing
particularly impressive.
From John Hascall's post I am extremely interested in using DRBD to
effectively distribute any filesystem updates. This seems a more
appropriate solution for my needs because, unfortunately, I don't have
access to 'proper' servers, and the Linux support for the SATA
controller on these motherboards (yes, I know, embedded controllers are
satan) is extremely poor.
Many thanks for taking the time to help me. I believe I may even attempt
to combine DRBD with AFS because we will shortly be opening a second
staffed site, meaning I will require some form of 'Global Filesystem' if
you will (no implication of GFS).
Thanks again,
Paul
Jeffrey Altman wrote:
> Paul Robins wrote:
>
>
>>Well that's what I was originally wondering: can AFS provide the ability
>>to replicate the contents of one fileserver to others which can be used
>>redundantly? It appears not at all; I'd still like to use AFS but I do
>>think I'm going to have to go NFS and then some sort of faux RAID 1 for
>>redundancy.
>
>
> Paul:
>
> The real question you have to answer is what risks are you concerned
> about? What is the likelihood that you are going to lose an entire
> server without warning in such a manner that it makes a difference to
> the clients that would be communicating with it?
>
> The reason I specify "without warning" is that AFS far surpasses the
> capabilities of other file systems in the area of volume management.
> You said earlier in the thread that your biggest fear was losing a
> disk. So we can make that your warning sign. For each file server
> you deploy, use mirrored disks (RAID-1) with each disk on its
> own interface card. Then deploy your file servers and leave enough
> empty space on each of the servers such that if necessary you can
> move all of the volumes on any one server to any of the other servers.
>
> Now if a disk ever fails the operation of the file server will be
> uninterrupted. You can then initiate volume moves of the
> non-replicated read-write volumes to other servers. These moves can
> be performed while the clients are actively using them. The clients
> will continue using the source server until the move is almost complete.
> There will then be a brief busy state in which the client waits, followed
> by a moved notification, which the client responds to by looking up the
> new location and continuing where it left off on the new server.
>
> Once all of the volumes have been moved off the server, you can take
> the server down and replace the disk or perform whatever form of
> maintenance that is required.
>
> In the recent past I have seen more outages caused for end users by
> a need to reconfigure non-Andrew file systems either for volume
> redistribution or physical maintenance than I have for physical failures
> in AFS deployments. AFS volume management allows you to perform more
> frequent maintenance of the hardware and the OS without impacting
> end users than other models.
>
> While a network based RAID-5 is a fine idea, the performance is really
> going to be quite poor from the perspective of end users even when the
> machines are physically quite close. Network RAIDs have the potential
> to provide redundancy when whole portions of the network infrastructure
> are lost. However, they do so at a significant cost in performance.
>
> Jeffrey Altman
>
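
The planning step described in the quoted message (leave enough free
space that any one server's volumes can be moved to any other server,
then move them off with volume moves) can be sketched roughly as below.
This is only an illustration under stated assumptions: the server names,
sizes, volume name and partition names are hypothetical, the check uses
the strict reading that each other server must be able to absorb the
load on its own, and the helper only builds a "vos move" argument list
without running it.

```python
from typing import Dict, List


def evacuation_shortfalls(
    used_gb: Dict[str, float], capacity_gb: Dict[str, float]
) -> Dict[str, List[str]]:
    """Map each server to the other servers that could NOT absorb all of
    its volumes. An empty result means any server can be evacuated to
    any single other server."""
    shortfalls: Dict[str, List[str]] = {}
    for src, src_used in used_gb.items():
        too_small = [
            dst
            for dst in used_gb
            if dst != src and capacity_gb[dst] - used_gb[dst] < src_used
        ]
        if too_small:
            shortfalls[src] = too_small
    return shortfalls


def vos_move_args(
    volume: str, src: str, src_part: str, dst: str, dst_part: str
) -> List[str]:
    """Build (but do not run) a 'vos move' command line for one volume."""
    return [
        "vos", "move",
        "-id", volume,
        "-fromserver", src,
        "-frompartition", src_part,
        "-toserver", dst,
        "-topartition", dst_part,
    ]


if __name__ == "__main__":
    # Hypothetical three-server cell with 500 GB per server.
    capacity = {"fs1": 500.0, "fs2": 500.0, "fs3": 500.0}
    used = {"fs1": 200.0, "fs2": 150.0, "fs3": 350.0}
    print(evacuation_shortfalls(used, capacity))
    print(" ".join(vos_move_args("user.paul", "fs1", "/vicepa", "fs2", "/vicepa")))
```

If the first function reports a shortfall, the options are either to add
space or to accept that a failed server's volumes would have to be
spread across several destinations rather than moved to just one.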