[OpenAFS-devel] Regarding GSoC 2010 Collaborative Caching Project

shruti jain shruti.jain1988@gmail.com
Tue, 20 Apr 2010 23:50:56 +0530


--001485e8e67c7c7beb0484af234d
Content-Type: text/plain; charset=ISO-8859-1

Hi,

This looks good. I have understood what you intend to achieve in this
project. Thanks for the clarifications.

Shruti

On Sat, Apr 17, 2010 at 7:31 PM, Jeffrey Altman <
jaltman@secure-endpoints.com> wrote:

> On 4/17/2010 2:25 AM, shruti jain wrote:
> > Here is what I know about the cache manager and its file server
> > interactions.
> > The Cache Manager resides on the client side in an OpenAFS environment
> > and communicates with the AFS file server on behalf of the application
> > programs running on the client. When an AFS file is needed by any
> > application program running on a client machine, the request is sent to
> > the Cache Manager, which in turn issues RPC calls to the file server
> > storing the requested file.
>
> This is true for any object (file, directory, mount point, symlink, ...).
>
> AFS supports read-only replicas.  The CM is permitted to request copies
> of the data from any of the replicas, although at present the CM only
> reads from a single replica at a time.
>
> > When the Cache Manager receives the
> > requested data from the file server, it stores it in the cache and also
> > delivers it to the application program that initially requested
> > the data. In order to maintain cache consistency, the server issues a
> > callback along with the data. A callback is a promise by a File Server
> > to a Cache Manager to inform it of any change in the data delivered by
> > the File Server to the Cache Manager. If any other client on the network
> > modifies the file, then the file server breaks this callback and thus
> > indicates to the Cache Manager that its locally cached copy of
> > the file is obsolete and needs to be updated. The callback mechanism
> > ensures that the Cache Manager always requests the most up-to-date
> > version of a file. In this way, the Cache Manager also takes
> > responsibility for maintaining the cache.
>
> You have the general idea.  Let me provide a few additional details.  In
> the original (and currently deployed) implementation of callbacks, a
> callback is a promise that the FS will notify the CM of a change for up
> to S seconds, where S is typically measured in minutes for read/write
> data and in hours for read-only data.  The number of callback promises
> (or registrations) that an FS can maintain is finite.  Callback
> registrations can therefore be canceled prematurely without there being
> a change.
>
> The callback notification (or invalidation) is delivered via an
> unauthenticated RPC channel.  As a result, the notification cannot be
> trusted by the CM and must be treated as meaning "a change might have
> occurred, please verify if it matters".
>
> The existing callback notification does not provide any hint as to the
> type of change that might have occurred.  Callback notifications are
> issued for many reasons including:
>
>  . the data changed
>  . the access control list changed
>  . other metadata changed
>  . the locking state changed
>  . the volume in which the data is located is being replicated
>   (aka released)
>  . the object has been deleted
>  . the FS ran out of room in the registration table
>
> Once a notification is issued, the registration is broken and the
> CM will receive no further notifications until it requests updated
> status for the object in question.
>
> The CM determines what has changed by issuing a FetchStatus RPC to
> the FS and comparing the prior and current status fields.
>
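
To check my understanding of the callback behaviour described above, here is a minimal Python sketch. It is not OpenAFS code: the class `CallbackTable`, its methods, and the helper `changed_fields` are hypothetical names, and evicting the oldest registration when the table is full is only an assumed policy. It shows a finite table of timed promises, a notification that carries no reason and breaks the registration, and the CM comparing prior and current status after a FetchStatus.

```python
import time
from collections import OrderedDict


class CallbackTable:
    """Toy file-server callback table (hypothetical, not the OpenAFS one)."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock
        self.entries = OrderedDict()  # (client, fid) -> expiry time
        self.sent = []                # notifications delivered, with no reason

    def register(self, client, fid, lifetime_s):
        key = (client, fid)
        self.entries.pop(key, None)
        if len(self.entries) >= self.capacity:
            # Table full: break the oldest promise although nothing changed.
            old_key, _ = self.entries.popitem(last=False)
            self.sent.append(old_key)
        self.entries[key] = self.clock() + lifetime_s

    def notify_change(self, fid):
        now = self.clock()
        for key in [k for k in self.entries if k[1] == fid]:
            expiry = self.entries.pop(key)  # registration is now broken
            if expiry > now:                # expired promises are dropped silently
                self.sent.append(key)


def changed_fields(prior, current):
    """What the CM does after a notification: compare two status records."""
    return sorted(k for k in current if prior.get(k) != current[k])
```

After a notification the client gets nothing further for that object until it registers again, which in this sketch means calling `register` once more.
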
> Matt Benjamin has developed and implemented (but it is not shipping yet)
> an extended version of callback notifications that provides the CM with
> additional details regarding the change.  When combined with an
> authenticated callback channel this becomes a very powerful combination.
>
> It is also important to discuss how the FS and CM track object data.
> Each time a change to the data (not the metadata) occurs, a data version
> (DV) number for the object is incremented.  When the CM issues a
> StoreData RPC, the reply includes updated status info.  If the DV was
> incremented by one, then the CM knows that there was no race with
> another CM and all of the data in the cache for that file is still
> current.  If the DV increment was greater than one, then the CM knows
> that the data it just wrote is current, but all other data is suspect.
>
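
The data version rule above can be written down as a small decision function. This is only a sketch of the reasoning in the paragraph: the function name and the returned labels are my own, and treating an increment of zero or less as unexpected is an assumption.

```python
def cache_state_after_store(dv_before, dv_after):
    """Classify cached data for a file after the CM's own StoreData.

    dv_before: DV the CM held before the store.
    dv_after:  DV returned in the status info of the StoreData reply.
    """
    delta = dv_after - dv_before
    if delta == 1:
        # Only our own write happened: everything cached is still current.
        return "all-current"
    if delta > 1:
        # Another CM wrote in between: trust only the range we just wrote.
        return "only-written-range-current"
    # Assumption: a successful store always advances the DV.
    return "unexpected"
```

For example, going from DV 7 to DV 8 keeps the whole cached file, while going from DV 7 to DV 10 leaves only the freshly written range trusted.
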
> When using the Extended Callback mechanism, the FS can issue a
> notification that a StoreData occurred affecting {FileID, offset,
> length} and the current DV is N without canceling the callback
> registration.  This permits the CM to maintain the cache coherency at a
> lower cost of network traffic when an object is actively being used.
>
> However, when a CM starts or when an object has been idle for more than
> a few minutes, there will be no callback registration.  In that
> situation, a change could have occurred to the file data and the CM will
> be forced to discard all of the cached data if a change did occur.
> Unfortunately, there is no mechanism at present for the CM to ask the FS
> "I need the chunk of data represented by {FileID, offset, length} but I
> currently have data in that range with the following hash value.  Could
> you confirm that my data is current or send me the correct data?"
>
> I have been considering a proposal to implement such an RPC,
> RXAFS_FetchDataWithHash(FID, offset, length, hash).  With such an RPC in
> place, the CM can verify the contents of the cache and avoid large
> amounts of unnecessary traffic.
>
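
A rough sketch of what the file-server side of the proposed RPC might do, as I understand it. Everything here is an assumption for illustration: the Python function `fetch_data_with_hash`, the in-memory `files` dictionary, the reply labels, and the choice of SHA-256 (the proposal does not name a hash).

```python
import hashlib


def chunk_hash(data):
    """Hash of a chunk; SHA-256 is an assumed choice, not part of the proposal."""
    return hashlib.sha256(data).hexdigest()


def fetch_data_with_hash(files, fid, offset, length, client_hash):
    """Confirm the client's chunk or send the correct bytes.

    files: dict mapping a file ID to its current contents (bytes).
    Returns ("current", b"") when the client's hash matches, so no data
    crosses the network, or ("data", chunk) when it does not.
    """
    current = files[fid][offset:offset + length]
    if chunk_hash(current) == client_hash:
        return ("current", b"")
    return ("data", current)
```

With this shape, a CM whose cached range is still valid pays for one small request and reply instead of re-fetching the whole chunk.
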
> I am raising this idea here because I believe it is very applicable to
> your project.  The trust model in AFS is between the CM and the FS.
> There is no trust between CMs.  As a result, if a CM obtains data from
> another CM, it needs a low cost mechanism to validate it against the FS.
>
> > So in this project, we need to modify the cache manager to enable
> > interactions with other clients as well.
> > In the first part of the project, where the cache manager contacts a
> > fixed set of remote clients, it retrieves the file from any of these
> > clients if their callback for the file is not broken. An unbroken
> > callback indicates that the copy of the file on this remote
> > client is the most recent. If no client has the most recent copy of the
> > file, we can contact the file server to retrieve the data.
>
> That is one approach but not the one I would take.  If reading the data
> from a local CM is so much cheaper than reading it from the FS, the CM
> can read the data from the other CM (or at least get its hash) and then
> verify it with the file server.
>
> In most file operations, the entire file is not re-written; just
> portions of it are.  In the case of "append only files" such as log
> files, the data never changes after it is written.  Re-fetching this
> data from the FS every time the DV changes is extremely wasteful.  It is
> much better to obtain it by the cheapest mechanism possible and then
> verify it via a trusted means.
>
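
The "obtain cheaply, then verify via a trusted means" flow could look roughly like this on the client side. It is a sketch under assumptions: `read_via_peer`, `peer_fetch`, and `fs_fetch_with_hash` are hypothetical callables, the hash is again assumed to be SHA-256, and the file-server call is assumed to answer either ("current", b"") or ("data", bytes).

```python
import hashlib


def read_via_peer(fid, offset, length, peer_fetch, fs_fetch_with_hash):
    """Read a chunk from an untrusted peer CM, validated by the trusted FS.

    peer_fetch(fid, offset, length) returns bytes, or None if the peer
    has nothing. fs_fetch_with_hash(fid, offset, length, digest) returns
    ("current", b"") or ("data", correct_bytes).
    Returns (data, source) where source is "peer" or "fs".
    """
    candidate = peer_fetch(fid, offset, length)
    # An empty digest can never match, so a peer miss falls back to the FS.
    digest = hashlib.sha256(candidate).hexdigest() if candidate is not None else ""
    status, data = fs_fetch_with_hash(fid, offset, length, digest)
    if status == "current":
        return candidate, "peer"  # peer copy confirmed by the FS
    return data, "fs"             # peer copy stale or missing
```

The peer is never trusted on its own: its bytes are used only after the file server confirms the hash, and otherwise the file server's own data wins.
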
> > In the second part of the project, we can allow discovery of peer
> > clients for collaboration. This can be done by modifying the file server
> > to keep access logs of the clients; when a client requests any data,
> > the clients recorded in the logs for that data would be returned to the
> > requesting client. In order to maintain cache consistency, the
> > requesting client also establishes a callback guarantee from the file
> > server so that it learns of modifications to the file irrespective of
> > where it obtained the file.
>
> I would leave the FS out of the peer collaboration and instead permit
> CMs that wish to offer data to do so via Bonjour.
>
> >
> > I have looked at the files afs_callback.c, cbqueue.c, dcache.c and
> > server.c and think that these are some of the source files used in the
> > cache manager and server-cache manager interactions. Please correct me
> > if I am wrong.
>
> As for how I would like to see this project structured: before any
> collaboration is implemented I would like to see a generic mechanism
> added to the CM to permit use of a second level cache.  Then once that
> mechanism is in place, a plug-in to that framework can be implemented
> that supports obtaining data from the second level cache which happens
> to be peer CMs.
>
> The benefit of this approach is that the framework for the second level
> cache can be implemented and incorporated into a future OpenAFS release
> without committing us to a particular implementation of the peer-to-peer
> protocols.  Future research in peer-to-peer cache sharing can then take
> place at a much lower cost.
>
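
To make the proposed structure concrete for myself, here is a minimal sketch of a generic second level cache hook with a peer plug-in behind it. None of these names exist in OpenAFS: `SecondLevelCache`, `PeerCache`, and `CacheManager` are hypothetical, peers are modelled as plain dictionaries, and validation against the file server is deliberately left out to keep the sketch short.

```python
from abc import ABC, abstractmethod


class SecondLevelCache(ABC):
    """Generic interface the CM would call; plug-ins implement it."""

    @abstractmethod
    def lookup(self, fid, offset, length):
        """Return the chunk as bytes, or None on a miss."""


class PeerCache(SecondLevelCache):
    """One possible plug-in: the second level happens to be peer CMs."""

    def __init__(self, peers):
        self.peers = peers  # each peer modelled as a dict keyed by chunk

    def lookup(self, fid, offset, length):
        for peer in self.peers:
            data = peer.get((fid, offset, length))
            if data is not None:
                return data
        return None


class CacheManager:
    def __init__(self, fetch_from_fs, second_level=None):
        self.local = {}
        self.fetch_from_fs = fetch_from_fs
        self.second_level = second_level  # framework works with or without it

    def read(self, fid, offset, length):
        key = (fid, offset, length)
        if key in self.local:
            return self.local[key], "local"
        if self.second_level is not None:
            data = self.second_level.lookup(fid, offset, length)
            if data is not None:
                # A real CM would validate this against the FS first.
                self.local[key] = data
                return data, "second-level"
        data = self.fetch_from_fs(fid, offset, length)
        self.local[key] = data
        return data, "fs"
```

Because the CM only depends on the `lookup` interface, a different second level cache (or a different peer protocol) could be swapped in without touching the read path.
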
> Jeffrey Altman
>
>
>
>

--001485e8e67c7c7beb0484af234d--