[OpenAFS] pre-fetch cache

Brian May bam@snoopy.apana.org.au
Tue, 19 Jul 2005 22:58:16 +1000


>>>>> "Jeffrey" == Jeffrey Altman <jaltman@columbia.edu> writes:

    >> What would be nice is a special volume type that is a
    >> combination of the cache and replicated read-only volumes.  If
    >> the master copy of any file in this volume changes, then all
    >> replicated copies of this file are deleted, and the next
    >> request is forwarded to the master server. Write requests are
    >> forwarded to the master server. That way multiple clients at
    >> one site can "share" the one cache copy.

    Jeffrey> I don't understand how this would help share a cache.
    Jeffrey> The cache data would still have to be read over the
    Jeffrey> network.

The replicated copy could sit on a fast LAN at the local site while
the master server is reachable only over a slower WAN.

This would help in the situation where a number of clients at one site
need to access data from a remote site.

I need to talk to my client more about their requirements - I suspect
in this case it would be better if the servers were distributed, not
centralized as was requested.

Still, I think the above would be a good feature for applications that
need to distribute large files to a large number of clients spread
over a small number of sites.

Caching on the clients is good, but every computer at a site still has
to download each file for itself, which could be expensive if each
download has to come over a WAN connection.
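
To make the difference concrete, here is a toy model of the idea. It is
only a sketch: it counts WAN reads and does not talk to AFS at all. The
names Master, SiteReplica and wan_reads_for are made up for
illustration, and the notification that a real master would have to
send to other sites when a file changes is not modelled.

```python
# Toy model only: counts WAN transfers, does not talk to AFS.

class Master:
    """Stands in for the master server on the far side of the WAN."""

    def __init__(self):
        self.files = {}     # name -> (version, data)
        self.wan_reads = 0  # number of reads served over the WAN

    def write(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)

    def read(self, name):
        self.wan_reads += 1
        return self.files[name]


class SiteReplica:
    """Hypothetical cache copy; may be shared by clients on one LAN."""

    def __init__(self, master):
        self.master = master
        self.copies = {}    # name -> (version, data)

    def invalidate(self, name):
        # The master copy changed: delete the local copy.
        self.copies.pop(name, None)

    def read(self, name):
        if name not in self.copies:
            self.copies[name] = self.master.read(name)  # one WAN fetch
        return self.copies[name][1]

    def write(self, name, data):
        # Writes are forwarded to the master; the stale copy is dropped.
        self.master.write(name, data)
        self.invalidate(name)


def wan_reads_for(clients, shared):
    """Count WAN reads when `clients` machines each read one file once."""
    master = Master()
    master.write("video.mpg", b"x" * 1024)
    if shared:
        site = SiteReplica(master)
        caches = [site] * clients                 # one copy per site
    else:
        caches = [SiteReplica(master) for _ in range(clients)]  # one per client
    for cache in caches:
        cache.read("video.mpg")
    return master.wan_reads
```

With 20 clients at one site, the per-client case costs 20 WAN reads and
the shared case costs one; after a write through the replica, the next
read goes back to the master.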

Examples of large files could include video files, CAD designs, etc.

This is simple if files are only changed at one site (use rsync + http
mirrors, for example); it becomes more complicated if files can change
at any site.

    Jeffrey> Keep in mind that the Windows cache is now persistent.
    Jeffrey> You can cache up to 1.3GB on the client.  So you only
    Jeffrey> need to download the data once.  If you are planning on
    Jeffrey> reading the same data repeatedly across numerous reboots
    Jeffrey> you might not need to precache as much as you think.

The argument was "we don't want to load the network during peak
periods". I suspect not much thought was put into this.

Am I correct in my understanding of AFS that any file writes require
the entire file be copied back to the master server as soon as the
program calls close() on that file? I think my client forgot to
consider this.
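
If the copy back to the server really does happen at close(), then a
failure of that copy (a full quota, say) can only be reported to the
program through close() itself, so the result of close() needs to be
checked. A minimal sketch, assuming a POSIX system; write_and_close is
a made-up helper and nothing in it is AFS specific:

```python
import os


def write_and_close(path, data):
    """Write data to path; return (ok, error) for the final close()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
    except OSError:
        os.close(fd)
        raise
    try:
        # On a write-on-close filesystem this is where a failed
        # store to the server would show up.
        os.close(fd)
    except OSError as err:
        return False, err
    return True, None
```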

The client somehow wanted read/write access, guaranteed access to the
latest version, file locking when a file is opened (anywhere), and no
files to be transferred except at night.

Unfortunately, some of these conflicting requirements are simply not
possible without adding time travel capabilities (<grin>) to AFS. I
think time travel is beyond the budget of my client...

Also, my research says that the file locking won't work either: the
locking is needed for Microsoft Office applications, which use
byte-range locking. My understanding is that OpenAFS does not support
byte-range locking and does not try to simulate it (although doing so
is on the TODO list).
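
For anyone unsure what a byte-range lock is: it covers only part of a
file, so two programs can lock different regions of the same file at
the same time. The sketch below uses the POSIX equivalent via Python's
fcntl module on Unix. This is an assumption for illustration only:
Office on Windows uses the Windows locking calls, not fcntl, and
try_lock_range and unlock_range are made-up helper names.

```python
import fcntl
import os


def try_lock_range(fd, start, length):
    """Try to lock bytes [start, start + length) exclusively, no waiting."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                    length, start, os.SEEK_SET)
        return True
    except OSError:
        # Another process holds a conflicting lock, or the filesystem
        # does not support byte-range locks.
        return False


def unlock_range(fd, start, length):
    """Release a lock taken with try_lock_range."""
    fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
```

Running this against a file on the filesystem in question is a quick
way to see whether range locks are accepted there, although a lock
that is accepted is not necessarily enforced across clients.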

I guess I should go back to the client to clarify requirements and
bring some sort of reality into the discussion. I don't know of
anything that meets their requirements, but I think AFS is the closest
I have seen.
-- 
Brian May <bam@snoopy.apana.org.au>