[OpenAFS] Re: Advice on a use case

Andrew Deason adeason@sinenomine.net
Tue, 6 Nov 2012 10:49:33 -0600

On Tue, 6 Nov 2012 00:06:53 -0800
Timothy Balcer <timothy@telmate.com> wrote:

> I have a need to think about replicating a large number of large
> (multigigabyte) volumes (many terabytes of data total) to at least two
> other servers besides the read-write volume, and to perform these
> releases relatively frequently (much more than once a day, preferably).

How much more frequently? Hourly? Some people successfully release four
times an hour (and maybe more).

> Also, these other two (or more) read-only volumes for each read write
> volume will be remote volumes, transiting across relatively fat, but
> less than gigabit, pipes (100+ megabits)

Latency may matter more than bandwidth; do you know what it is?
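
A back-of-the-envelope sketch of why: if a transfer is limited by a fixed
window of unacknowledged data, throughput is capped at roughly
window / round-trip time, whatever the link speed. The 64 KiB window, the
5 GB payload and the round-trip times below are made-up numbers for
illustration, not measured or OpenAFS-specific values.

```python
def effective_bps(link_bps, window_bytes, rtt_seconds):
    """Throughput in bits/s: the lesser of the link rate and window/RTT."""
    window_limit_bps = window_bytes * 8 / rtt_seconds
    return min(link_bps, window_limit_bps)


def transfer_seconds(size_bytes, bps):
    """Time to move size_bytes at a steady rate, ignoring protocol overhead."""
    return size_bytes * 8 / bps


if __name__ == "__main__":
    size = 5 * 10**9           # hypothetical 5 GB of changed data
    link = 100 * 10**6         # 100 Mbit/s pipe, as described above
    window = 64 * 1024         # hypothetical window size (an assumption)
    for rtt in (0.001, 0.05):  # hypothetical 1 ms and 50 ms round trips
        bps = effective_bps(link, window, rtt)
        minutes = transfer_seconds(size, bps) / 60
        print(f"rtt={rtt * 1000:.0f} ms: {bps / 1e6:.1f} Mbit/s, {minutes:.1f} min")
```

With these assumed numbers the short round trip is limited by the link,
while the long one is limited by the window, which is the case where more
bandwidth would not help.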

> For the moment what I have decided to experiment with is a simple
> system.  My initial idea is to work the afs read-only volume tree into
> an AUFS union, with a local read write partition in the mix. This way,
> writes will be local, but I can periodically "flush" writes to the AFS
> tree, double check they have been written and released, and then
> remove them from the local partition. This should maintain integrity
> and high availability for the up-to-the-moment recordings, given I
> RAID the local volume. Obviously, this still introduces a single point
> of failure... so I'd like to flush as frequently as possible.
> Incidentally, it seems you can NFS export such a union system fairly
> simply.

I'm not sure I understand the purpose of this; are you trying to write
new data from all of the 'remote' locations, and you need those writes
to 'finish' quickly?

> But, I feel as if I am missing something... it has become clear that
> releasing is a pretty intensive operation, and if we're talking about
> multiple gigabytes per release, I can imagine it being extremely
> difficult.  Is there a schema that I can use with OpenAFS that will
> help alleviate this problem? Or perhaps another approach I am missing
> that may solve it better?

Eh, some people do that; the main cost is that it reduces the benefit of
client-side caching. Every time you release a volume, the server tells
clients that, for all data in that volume, they need to check with the
server to see whether their cached data differs from what's actually in
the volume. That may not matter much, though, especially for a small
number of large files.

To improve things, you could try to reduce the number of volumes that
are changing. For example, if you are adding new data in batches, you
might add each batch by creating a new volume instead of writing to
existing volumes; I don't know whether that's feasible for you.
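
One way the per-batch approach could look, as a sketch only: each batch
gets its own volume, which is created and mounted, filled with data, then
given read-only sites and released once. The server, partition, volume and
path names are hypothetical placeholders, and the functions only build the
`vos`/`fs` command lines so they can be reviewed before anything is run.
If the directory holding the mount point is itself in a replicated volume,
that parent volume would presumably still need its own release for the new
mount point to show up in the read-only tree.

```python
def create_cmds(volume, rw_server, partition, mount_dir):
    """Commands to create a new read-write volume and mount it."""
    return [
        ["vos", "create", "-server", rw_server,
         "-partition", partition, "-name", volume],
        ["fs", "mkmount", "-dir", mount_dir, "-vol", volume],
    ]


def publish_cmds(volume, ro_sites):
    """Commands to define read-only sites and release the volume to them.

    ro_sites is a list of (server, partition) pairs. These would be run
    after the batch's data has been written into the new volume.
    """
    cmds = [
        ["vos", "addsite", "-server", server,
         "-partition", partition, "-id", volume]
        for server, partition in ro_sites
    ]
    cmds.append(["vos", "release", "-id", volume])
    return cmds


if __name__ == "__main__":
    # All names below are hypothetical examples.
    vol = "rec.batch0001"
    plan = create_cmds(vol, "fs1.example.com", "a",
                       "/afs/.example.com/recordings/batch0001")
    plan += publish_cmds(vol, [("fs2.example.com", "a"),
                               ("fs3.example.com", "a")])
    for cmd in plan:
        print(" ".join(cmd))
```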

And, of course, the release process may simply not be fast enough to do
releases as often as you want. There may be ways to ship volume dumps
around yourself to get around that, and there are some pending
improvements to the volserver that would help, but I would only think
about that after you have tried the releases yourself.
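
For trying that out, a minimal sketch of timing a release and comparing it
with the interval you are aiming for. The volume name, the 15-minute
interval and the 50% headroom are hypothetical choices, and the script
assumes `vos` is on the PATH and that you already have the credentials
needed to release the volume.

```python
import subprocess
import time


def release_cmd(volume):
    """Command line for releasing one volume."""
    return ["vos", "release", "-id", volume]


def time_release(volume, run=subprocess.run):
    """Run the release and return how long it took, in seconds.

    `run` is injectable so the timing logic can be exercised without AFS.
    """
    start = time.monotonic()
    run(release_cmd(volume), check=True)
    return time.monotonic() - start


def fits_interval(duration_s, interval_s, headroom=0.5):
    """True if the release used at most `headroom` of the interval."""
    return duration_s <= interval_s * headroom


if __name__ == "__main__":
    took = time_release("rec.batch0001")  # hypothetical volume name
    print(f"release took {took:.1f} s;",
          "fits" if fits_interval(took, 15 * 60) else "too slow",
          "for a 15-minute interval")
```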

Andrew Deason