[OpenAFS] AFS Cache on Parallel File system

Thu, 7 Jul 2011 11:32:15 -0500

From my understanding, in the CEPH/Gluster projects a gateway would be
a way to access the parallel file system without using the native
client.  This is actually not what I want.  My approach is instead to
layer AFS on top of a PFS such that the cache is stored locally to the
whole cluster.

The idea is closest to the second extension in the list but differs
because there is no need for the cache managers to communicate (except
through shared files), as the data is already present on all of the
systems.

Spenser

On Thu, Jul 7, 2011 at 7:53 AM, Jeffrey Altman
<jaltman@your-file-system.com> wrote:
> Spenser:
>
> The AFS cache cannot be used the way you are thinking.  What you are
> looking for is a ceph/gluster to afs gateway which does not exist.
>
> The AFS cache is part of the security boundary that is enforced by the
> Cache Manager in the kernel.  As such, it is stored either in kernel
> memory or on local disk accessible only to the one kernel.  It is not
> designed for shared access.  Pointing multiple AFS cache managers at the
> same cache will most likely result in data corruption.
>
> There are two extensions to AFS that are being developed that will help
> cluster access to data stores from far away locations:
>
>  1. read/write replication which permits a single copy of the data
> generated at the slow site to be replicated to file servers near the
> cluster.
>
>  2. peer-to-peer cache sharing which permits an AFS cache manager to
> securely access data from another cache manager on the same subnet and
> avoid retransmitting it across a slow link.
>
> The first option is preferred when it is possible to deploy file servers
> in the cluster data center because it doesn't involve adding workload to
> client nodes and provides for the possibility of parallel reads.
>
> Jeffrey Altman
>
>
> On 7/7/2011 4:01 AM, Spenser Gilliland wrote:
>> Hello,
>>
>> Can the AFS cache be placed on a parallel file system (IE: ceph or gluster)?
>>
>> If the cache can be placed on a parallel file system,
>> When data is read into or written to the cache will all of the other
>> nodes in the cluster have access to this cached data for both reading
>> and writing?  And will every write block until it is written to the
>> AFS cell (IE: is it write back or write-through)?
>>
>> FYI: I'm going to give this a go here in a couple weeks and wanted to
>> know if anyone has tried it.
>>
>> The idea is to have an AFS Cell at home (very slow especially upload)
>> and a cluster at School which accesses this AFS Cell but only
>> downloads a file once for all of the servers in the cluster thereby
>> saving time and bandwidth.  Additionally, because the file is now on
>> the parallel file system all nodes can access the data concurrently.
>> When the program is finished the results will be available in the same
>> directory as the program.
>>
>> I'm thinking this could be immensely valuable for grid computing; if it works.
>>
>> Let me know if there is anything I should be looking out for along the way.
>>
>> Thanks,
>> Spenser
>>
>>
>
>

-- 
Spenser Gilliland
Computer Engineer
Illinois Institute of Technology