[AFS3-std] updated callback draft (d6)

Matt Benjamin matt@linuxbox.com
Sun, 01 Jun 2008 20:05:29 -0400


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

This draft differs from the previous in two (material) respects:

1. potential performance issue in directory change notifications
(mandate to send callback per change, where traditionally AFS would have
sent one callback in the common case)

2. ExtendedCallback proc extended to return a variable length array of
variant structures--intended to be a 0-length array of NoResult
structures in this draft (this is expected to be used by future
interface extensions, such as locking)

The solution to #1 is callback coalescing.  I've endeavored to clarify
why the mechanism is needed, and better explain how it is intended to
work.

Matt

*******************


AFS Callback Extensions

Matt Benjamin <matt@linuxbox.com>

06/01/2008

Status of this Memo

This document specifies a standards track protocol extension for
the OpenAFS community, and requests discussion and suggestions
for improvements.

Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
Internet Engineering Task Force RFC 2119.

Abstract

AFS cache-control strategy is callback (invalidate) based. The
AFS callback design allows a client to know when an object it has
cached is no longer consistent, but the callback notification
message itself provides no specific information about the
triggering event. This is a protocol inefficiency, as in several
scenarios it results in unnecessary round-trips to file servers
to verify file status information, file access information, or to
fetch file data which has not changed. We propose an extension of
the callback mechanism to provide information about the event(s)
triggering a callback, in the payload of the callback
notification message itself. The proposed mechanism eliminates
most or all unnecessary round-trips imposed by the current
callback mechanism, and simultaneously allows AFS implementations
to (efficiently) provide correct semantics in several scenarios
involving multiple writers (ie, where AFS currently provides
incorrect semantics).

Table of Contents

Status of this Memo
Key Words
Abstract
Table of Contents
    1 Introduction
    2 The AFS Callback Mechanism
        2.1 Description
        2.2 Analysis
    3 Extended Callback Interface
        3.1 Backward Compatibility
        3.2 Interface Changes
            3.2.1 Procedures
            3.2.2 Constants
            3.2.3 Data Types
                AFSExtendedCallback
                AFSCBFileStatus
                AFSCBDirStatus
                AFSCB_NotificationData
        3.3 Semantic Changes
            3.3.1 DataVersion Rule
        3.4 Callback Invocations
            3.4.1 AFSExtendedCallback
                Flags
                ExtraFlags
                DataVersion
                ExpirationTime
                Data
            3.4.2 Reasons for Cancellation
                AFSCB_Cancel_Shutdown
                AFSCB_Cancel_CallbackGC
                AFSCB_Cancel_VolumeOffline
                AFSCB_Cancel_VolumeMoved
                AFSCB_Cancel_LostMyMind
                AFSCB_Cancel_IHateYou
            3.4.3 ExtendedCallback Procedure
            3.4.4 Asynchronous Delivery
            3.4.5 CallBack Coalescing
                Call Consolidation
                Coalescing of Equivalent Notifications
            3.4.6 AFSCB_Event_StoreData
            3.4.7 AFSCB_Event_StoreACL
            3.4.8 AFSCB_Event_StoreStatus
            3.4.9 AFSCB_Event_CreateFile
            3.4.10 AFSCB_Event_MakeDir
            3.4.11 AFSCB_Event_Symlink
            3.4.12 AFSCB_Event_Link
            3.4.13 AFSCB_Event_RemoveFile
            3.4.14 AFSCB_Event_RemoveDir
            3.4.15 AFSCB_Event_Rename
            3.4.16 AFSCB_Event_Deleted
            3.4.17 AFSCB_Event_ReleaseLock
    4 Appendix A: XDR Grammar


1 Introduction

The AFS protocol provides a comprehensive framework for scalable,
secure, wide-area file sharing over IP networks. The AFS system
has historically distinguished itself through its emphasis on
scalability, a key source of which is client-side caching[1, 4].
File data, file and directory metadata, and access control
information may all be cached. Cache consistency is maintained
through client registration and an associated asynchronous
notification mechanism known as the callback.

The current AFS consistency model (which is of larger scope than
the callback mechanism, eg, it includes AFS sync-on-close
semantics) has allowed AFS to scale to large numbers of clients
(tens of thousands today), and to perform well under the
workloads for which AFS was originally designed.

However, AFS does not perform efficiently under other conditions,
such as when more than one client is interested in a file which
is changing--even if the file has only one writer, and many
readers[footnote:
A scenario which competing protocols efficiently support.
]. In general, the AFS protocol arguably (still, considering
improvements made between AFS-2 and AFS-3) places too little
emphasis on efficient caching of mutable data. The current AFS
consistency model is insufficient to correctly support
single-file, multiple-writer scenarios, including those required
for POSIX semantics, and therefore is insufficient to support
many applications which may be run correctly on competing
distributed file systems (eg, CIFS, Novell Netware, or NFSv4).

The efficiency of the current AFS cache management algorithm
could be substantially improved if specific triggering event
information and current status were included in the payload of
the callback notifications sent to clients. In particular,
inclusion of the current DataVersion number and affected byte
ranges in response to StoreData operations would significantly
reduce the need for cache revalidation and reconstruction traffic
in response to callbacks--in many cases, altogether. These
changes would allow efficient support for single-writer updates
on a file with multiple readers. More importantly, they would
permit AFS to correctly and efficiently support multiple writers
updating disjoint ranges on a single file, a prerequisite for
supporting granular file locking (and applications which require
it) in future.

2 The AFS Callback Mechanism

2.1 Description

When an AFS-3 client contacts a file server to perform any of
several operations on a file, or explicitly to fetch its status,
the file server includes in its RPC response an AFSCallBack
structure, representing the server's promise to call back the
client ``if any modifications are made to the data in the file''.[footnote:
A key paper on AFS-2 has ``before allowing a modification by any
other workstation''[1]. The wording of this statement appears
calculated to imply that the file server's promise to execute
callbacks synchronously with the triggering operations (eg,
StoreData) specifically constitutes part of the AFS cache
consistency guarantee. In our analysis, it does not, though it
does contribute strongly to the simplification of the file server
design and to reduction of file server workload.
] The AFSCallBack structure contains the callback expiration time,
and two integer values treated as invariants.

When any client executes an operation which would change a file
(eg, StoreData), and in a variety of other situations, the file
server invalidates the client's cached copy by executing a call
to the CallBack[footnote:
formerly BreakCallback
] procedure in the client's RPC interface. (The call includes in
its arguments an AFSCallBack structure for each file being
invalidated. However, the value of passed AFSCallBack is unused
[eg, afs/afs_callback.c:661-2]). Between the time of issue and
either expiry or receipt of a callback, the client may consider
any information it has cached on a file to be consistent with the
file server's on-disk copy. Conversely, on receipt of a callback,
the client must consider that it knows nothing about the file.
Thus the client must re-establish a relationship with the file at
the file server before executing any further operations on it.[footnote:
Since AFS supports the notion of a read-only volume all of whose
files may only be updated transactionally as a group, AFS permits
a file server to issue a single callback when any file in a
read-only volume is accessed. This is a significant performance
optimization for, by definition, cache management of immutable
data, and so is not discussed further here.
]

The AFS callback mechanism obviates the need for clients to send
frequent cache validation requests before performing operations
on their locally cached copies of objects, reducing network
traffic as well as file server workload[4]. The callback
innovation has been since taken up, with variations, by other
distributed file system protocols[2, 3, 5].

2.2 Analysis

The AFS callback mechanism reliably notifies clients when
information they may have cached becomes invalidated, but omits
to send information it trivially knows, ie, the triggering event,
that could certainly be used by the client to more efficiently
manage cache state.

For example, consider the case where 2 clients A and B are
interested in a file, each having read chunks 1-15 into cache.
Now another client C initiates a change in the file, writing a
new state to chunk 45. This event today triggers a callback, but
also invalidates 30 chunks correctly cached on A and B, which,
should they remain interested, they must refetch (up to 2
megabytes of data, in this case). This scenario may seem
relatively unlikely to occur (but of course, probably does occur
reasonably often in environments where mutable data is common),
but a related scenario involving directory entries (omitted for
brevity) is much more common. In these cases, an AFS callback
mechanism capable of sending triggering event information with
the callback would have facilitated a more efficient result, at
small marginal cost. In another set of scenarios where a client A
has changed data in a file invalidated by non-overlapping stores
by B, a revised mechanism would be capable of delivering a
correct result, whereas a correct result would be impossible with
the mechanism in AFS today.
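
The refetch cost in the first example can be checked with a few lines of arithmetic. This is only a sketch: the 64 KiB chunk size is our assumption (the draft does not state the cache chunk size), and the function name is hypothetical.

```python
# Assumed cache chunk size; the draft does not specify one.
ASSUMED_CHUNK_SIZE = 64 * 1024


def refetch_cost(clients, chunks_per_client, chunk_size=ASSUMED_CHUNK_SIZE):
    """Return (invalidated_chunks, bytes_to_refetch) for a traditional callback.

    A traditional callback invalidates every cached chunk of the file on
    every client holding a callback, regardless of which chunk changed.
    """
    invalidated = clients * chunks_per_client
    return invalidated, invalidated * chunk_size


# Clients A and B each cache chunks 1-15; client C writes chunk 45.
chunks, nbytes = refetch_cost(clients=2, chunks_per_client=15)
print(chunks, nbytes)
```

Under that assumption, 30 chunks amount to a little under 2 megabytes, none of which overlaps the chunk that actually changed.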

The justification for sending minimal information with the
callback is presumably to minimize the execution cost of the
callback procedure. The increased cost of sending a limited but
informative callback notification to clients, relative to sending
an uninformative one, is small. Analysis of the OpenAFS file
server code reveals that the file server always has the
information that would logically be sent as extended callback
information in response to file operations (eg, file ranges
affected by StoreData operations, or changed entries for various
directory modification operations).

For these reasons, enhancement of the AFS callback interface to
supply triggering event information seems likely to improve both
correctness and performance of AFS implementations, and
experimental implementation and profiling appear justified.

3 Extended Callback Interface

3.1 Backward Compatibility

AFS clients will indicate their preference to receive extended
callback notifications through a new client capability flag:

const CLIENT_CAPABILITY_EXT_CALLBACK      = 0x0002;

3.2 Interface Changes

3.2.1 Procedures

We propose a new procedure ExtendedCallback in the client's RPC
interface. ExtendedCallback follows the style of the traditional
AFS CallBack procedure in accepting parallel sequences of FIDs
and structures. An OUT-direction array of variant
AFSExtendedCallBackResult structures is added for future callback
notification styles (eg, locks, delegations) which may return
structured data on receipt of notifications:

typedef AFSExtendedCallBack AFSExtendedCallBackSeq<AFSCBMAX>;
typedef AFSExtendedCallBackResult AFSExtendedCallBackRSeq<AFSCBMAX>;

proc ExtendedCallBack(
    IN  HostIdentifier *Server,
    IN  AFSCBFids *Fids_Array,
    IN  AFSExtendedCallBackSeq *CallBacks_Array,
    OUT AFSExtendedCallBackRSeq *CallBack_Result_Array
) multi = 65540;

As detailed in section 3.2.2, AFSExtendedCallBackSeq
resolves to a sequence of AFSExtendedCallBack structures whose
type is an XDR union, discriminated on the callback event type.
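
The calling convention can be illustrated with a small client-side sketch. It is not the rxgen-generated stub: the function name, the handle_one parameter, and the use of Python lists and dicts for the XDR sequences are all assumptions made for illustration. It shows only the parallel-array contract and the zero-length result array this draft intends.

```python
def extended_callback(server_uuid, fids, callbacks, handle_one):
    """Hypothetical client-side handler for ExtendedCallBack.

    fids and callbacks are parallel sequences: callbacks[i] describes the
    event on fids[i]. handle_one is a caller-supplied function that applies
    one notification to the cache.
    """
    if len(fids) != len(callbacks):
        raise ValueError("Fids_Array and CallBacks_Array must be parallel")
    for fid, callback in zip(fids, callbacks):
        handle_one(fid, callback)
    # In this draft the OUT array is intended to be zero-length.
    return []


seen = []
result = extended_callback(
    "server-uuid",
    [(1, 2, 3)],
    [{"Flags": 0, "DataVersion": 8}],
    lambda fid, cb: seen.append((fid, cb["DataVersion"])),
)
print(result, seen)
```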

3.2.2 Constants

The following callback event types are defined:

const AFSCB_Event_Cancel = 1;       /* extended */
const AFSCB_Event_StoreData = 2;    /* data in file changed */
const AFSCB_Event_StoreACL = 3;     /* ACL changed on vnode */
const AFSCB_Event_StoreStatus = 4;  /* status stored on vnode */
const AFSCB_Event_CreateFile = 5;   /* file created in directory vnode */
const AFSCB_Event_MakeDir = 6;      /* dir created in directory vnode */
const AFSCB_Event_Symlink = 7;      /* symlink created in directory vnode */
const AFSCB_Event_Link = 8;         /* hard link created in directory vnode */
const AFSCB_Event_RemoveFile = 9;   /* file removed from directory vnode */
const AFSCB_Event_RemoveDir = 10;   /* dir removed from directory vnode */
const AFSCB_Event_Rename = 11;      /* object renamed (moved) */
const AFSCB_Event_Deleted = 12;     /* object no longer exists, ex object */
const AFSCB_Event_ReleaseLock = 13; /* traditional AFS lock released */

A flag constant is provided to indicate callback cancellation
along with an extended notification message of any of the above
types:

const AFSCB_Flag_Cancel = 1; /* Callback promise is cancelled */

The following constants indicate reasons for cancellation, when
(Flags & AFSCB_Flag_Cancel):

const AFSCB_Cancel_Shutdown = 1;
const AFSCB_Cancel_CallbackGC = 2;
const AFSCB_Cancel_VolumeOffline = 3;
const AFSCB_Cancel_VolumeMoved = 4;
const AFSCB_Cancel_LostMyMind = 5;
const AFSCB_Cancel_IHateYou = 6;

The following constants indicate direction (from or to called
back FID) in the atomic AFSCB_Event_Rename notification:

const AFSCB_Rename_From = 1;
const AFSCB_Rename_To = 2;

The following constant indicates a default (void) result type
discriminator for the AFSCB_ResultData union:

const AFSCB_Result_NoResult = 1;

3.2.3 Data Types

  AFSExtendedCallback

The AFSExtendedCallBack data type contains members Flags,
ExtraFlags, DataVersion, ExpirationTime, and Data, where Flags and
ExtraFlags provide extra information, DataVersion is a (possibly
incremented) DataVersion, ExpirationTime is a (possibly extended)
callback expiration time, and Data is an object of the
discriminated union type AFSCB_NotificationData:

struct AFSExtendedCallBack {
    afs_uint32 Flags;
    afs_uint32 ExtraFlags;
    afs_uint32 DataVersion;
    afs_uint32 ExpirationTime;
    AFSCB_NotificationData Data;
};

If the AFSCB_Flag_Cancel bit is set in Flags, the callback is
cancelled upon receipt of the message. In that event, a non-zero
value of ExtraFlags indicates the reason for the cancellation.

  AFSCBFileStatus

The AFSCBFileStatus structure is a reduced-footprint
AFSFetchStatus replacement intended to communicate changed vnode
information in response to StoreData operations:

struct AFSCBFileStatus {
    afs_uint64 ClientModTime;
};

  AFSCBDirStatus

The AFSCBDirStatus structure is a reduced-footprint
AFSFetchStatus replacement intended to communicate changed vnode
information in response to directory change operations:

struct AFSCBDirStatus {
    afs_uint32 LinkCount;
    afs_uint64 ClientModTime;
};

  AFSCB_NotificationData

AFSCB_NotificationData is a union discriminated by callback event
type, ie, its discriminant may be any of the event type constants
defined in section 3.2.2.

union AFSCB_NotificationData switch (afs_uint32 Event_Type) {
case AFSCB_Event_StoreData:
    AFSCB_Data_StoreData u_store_data;
case AFSCB_Event_StoreACL:
    void;
case AFSCB_Event_StoreStatus:
    AFSCB_Data_StoreStatus u_store_status;
case AFSCB_Event_CreateFile:
    AFSCB_Data_CreateFile u_create_file;
case AFSCB_Event_MakeDir:
    AFSCB_Data_MakeDir u_make_dir;
case AFSCB_Event_Symlink:
    AFSCB_Data_Symlink u_symlink;
case AFSCB_Event_Link:
    AFSCB_Data_Link u_link;
case AFSCB_Event_RemoveFile:
    AFSCB_Data_RemoveFile u_remove_file;
case AFSCB_Event_RemoveDir:
    AFSCB_Data_RemoveDir u_remove_dir;
case AFSCB_Event_Rename:
    AFSCB_Data_Rename u_rename;
case AFSCB_Event_Deleted:
    void;
case AFSCB_Event_ReleaseLock:
    AFSCB_Data_Lock u_release_lock;
case AFSCB_Event_Cancel:
    void;
};

The types of the variant members are enumerated and
discussed in detail in section 3.4.

3.3 Semantic Changes

A file server MAY send traditional callback messages, with
traditional semantics, to any AFS client in response to any
event. A file server MAY send extended callback notifications to
any client which has announced the capability to use the extended
interface, with the following semantics:

• extended callback notification messages, in general, preserve
  the file server's callback promise to send further
  notifications for the called-back FID

• the file server may revoke the callback promise with any
  extended callback notification message, by setting the
  AFSCB_Flag_Cancel bit in the Flags member of the
  AFSExtendedCallback structure

• the AFSCB_Event_Cancel message is similar to a traditional AFS
  callback, breaking the callback promise, and requesting the
  client not request further status on the FID

3.3.1 DataVersion Rule

The various extended callback notification messages include
information a client may use to selectively invalidate or
reconstruct its cache. In interpreting each message, the client
MUST observe the DataVersion rule, which states:

If the client's cached DataVersion is equal to the DataVersion
carried in the message or to (DataVersion-1), the client may
invalidate or reconstruct its cache using the type-dependent
information contained in the message. In all other cases, the
client MUST regard the message as equivalent to a traditional AFS
callback.
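
The rule reduces to a single comparison. The sketch below uses a hypothetical function name and ignores wrap-around of the 32-bit DataVersion field, which the draft does not discuss.

```python
def may_apply_incrementally(cached_dv, message_dv):
    """Return True if the DataVersion rule lets the client patch its cache.

    cached_dv is the DataVersion the client holds for the FID; message_dv
    is the DataVersion carried in the AFSExtendedCallBack structure.
    A False result means: treat the message as a traditional callback.
    """
    return cached_dv in (message_dv, message_dv - 1)


for cached in (6, 7, 8, 9):
    print(cached, may_apply_incrementally(cached, message_dv=8))
```

A client that has missed an intermediate update (cached version 6 against message version 8) cannot know what changed in between, so it falls back to full invalidation.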

The semantics of specific callback events are enumerated in
section 3.4.

3.4 Callback Invocations

The various extended callback notification types generally
respond to specific events at the file server, but present a view
of it relevant to a specific callback promise at one client. In
one case (ie, AFSCB_Event_Rename), the file server is sending
notification of an event which affects two FIDs, either or both
of which may be cached by the receiving client. A structure of
type AFSExtendedCallback is sent with each extended callback
notification message, as noted above. Unless otherwise noted, FID
is the FID of the object that is the subject of the callback.

3.4.1 AFSExtendedCallback

The members of the AFSExtendedCallback structures are to be
interpreted as follows:

  Flags

If the 1-bit (AFSCB_Flag_Cancel) is set, the notification effects
a callback break. The client may make use of the information sent
with the message.

  ExtraFlags

If (Flags & AFSCB_Flag_Cancel), a non-zero value for ExtraFlags
indicates the reason for cancellation.

  DataVersion

The value of DataVersion at completion of the event of which the
client is being notified.

  ExpirationTime

The new expiration time asserted for the server's callback
promise, not necessarily different from the existing expiration
cached by the client.

  Data

The message-specific data for this notification.

3.4.2 Reasons for Cancellation

The following reasons for cancellation are defined:

  AFSCB_Cancel_Shutdown

The server or service is shutting down.

  AFSCB_Cancel_CallbackGC

Callback has been disposed during periodic garbage collection.

  AFSCB_Cancel_VolumeOffline

The volume associated with FID is now offline.

  AFSCB_Cancel_VolumeMoved

The volume associated with FID has moved.

  AFSCB_Cancel_LostMyMind

The server may be having problems related to provisioning an
insufficient number of callback structures.

  AFSCB_Cancel_IHateYou

Callback has been administratively revoked.
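
A client might decode the cancellation flag and reason as sketched below. The numeric values follow the constants listed in section 3.2.2; the appendix grammar gives different values for some of them, so the numbers should be treated as provisional. The function name and the description strings are ours.

```python
AFSCB_FLAG_CANCEL = 1

# Reason codes as listed in section 3.2.2 (provisional).
CANCEL_REASONS = {
    1: "shutdown",          # AFSCB_Cancel_Shutdown
    2: "callback gc",       # AFSCB_Cancel_CallbackGC
    3: "volume offline",    # AFSCB_Cancel_VolumeOffline
    4: "volume moved",      # AFSCB_Cancel_VolumeMoved
    5: "lost my mind",      # AFSCB_Cancel_LostMyMind
    6: "i hate you",        # AFSCB_Cancel_IHateYou
}


def describe_cancellation(flags, extra_flags):
    """Return None if the promise survives, else a reason string."""
    if not flags & AFSCB_FLAG_CANCEL:
        return None
    # A zero or unknown ExtraFlags value carries no usable reason.
    return CANCEL_REASONS.get(extra_flags, "unspecified")


print(describe_cancellation(0, 0))
print(describe_cancellation(1, 4))
```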

3.4.3 ExtendedCallback Procedure

Extended callbacks are delivered through a new ExtendedCallback
procedure.

proc ExtendedCallBack(
    IN  HostIdentifier *Server,
    IN  AFSCBFids *Fids_Array,
    IN  AFSExtendedCallBackSeq *CallBacks_Array,
    OUT AFSExtendedCallBackRSeq *CallBack_Result_Array
) multi = 65540;

ExtendedCallback is modelled on the traditional CallBack
procedure, but adds UUIDs uniquely identifying the file server
host.

3.4.4 Asynchronous Delivery

A server implementation MAY deliver extended callback
notifications asynchronously with respect to the operation which
triggered the notification.

3.4.5 CallBack Coalescing

A server implementation electing to deliver extended callback
notifications asynchronously MAY, in addition, coalesce sequences
of effectively-simultaneous notifications to a single client.

This provision avoids performance regression in scenarios where a
single logical event or operation would otherwise trigger
potentially many notification messages to a single client--for
example, many near-simultaneous CreateFile operations in a single
directory as might occur on expansion of a tar archive, or many
near-simultaneous stores appending to a single file.[footnote:
The callback coalescing concept is re-introduced following
discussions at the 2008 AFS and Kerberos Best Practices Workshop.
]

  Call Consolidation

A server implementation electing to deliver extended callback
notifications asynchronously MAY coalesce any sequence of
effectively simultaneous notifications into parallel arrays of
FIDs and callback structures, as implied by the type signature of
the ExtendedCallBack procedure. Any number of such callbacks may
be combined, up to the limit of AFSCBMAX.
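
Call consolidation can be sketched as batching a per-client queue into parallel arrays. The queue representation and function name are hypothetical, and the default limit of 50 is an assumption standing in for AFSCBMAX, whose value is defined outside this draft.

```python
def consolidate_calls(pending, afscbmax=50):
    """Split queued (fid, callback) pairs into ExtendedCallBack-sized batches.

    Returns a list of (fids, callbacks) tuples; within each tuple the two
    lists are parallel and hold at most afscbmax entries. Queue order is
    preserved.
    """
    batches = []
    for start in range(0, len(pending), afscbmax):
        window = pending[start:start + afscbmax]
        fids = [fid for fid, _ in window]
        callbacks = [callback for _, callback in window]
        batches.append((fids, callbacks))
    return batches


queue = [(("vol", n, 1), {"event": "CreateFile"}) for n in range(120)]
print([len(fids) for fids, _ in consolidate_calls(queue)])
```

With 120 queued notifications and a limit of 50, the server would issue three calls instead of 120.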

  Coalescing of Equivalent Notifications

A server implementation electing to deliver extended callback
notifications asynchronously MAY coalesce a sequence of
effectively simultaneous and equivalent notifications to the same
client into a single callback in a notification message. The
following combinations of operations are explicitly permitted:

• sequences of AFSCB_Event_StoreACL notifications on FID may be
  delivered as a single notification

• sequences of AFSCB_Event_StoreStatus notifications on FID may be
  delivered as the single notification of the most recently
  stored status

• sequences of AFSCB_Event_StoreData notifications on the same
  file at adjacent or overlapping byte ranges may deliver a
  single notification at the consolidated range
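
The StoreData case amounts to merging byte intervals. The following is a minimal sketch with a hypothetical function name; it deals only with the ranges and leaves aside which DataVersion and status the consolidated notification should carry.

```python
def coalesce_store_ranges(ranges):
    """Merge (StoreOffset, StoreLength) pairs that are adjacent or overlap.

    Ranges that neither touch nor overlap are kept separate, since the
    draft only permits consolidating adjacent or overlapping stores.
    """
    merged = []  # list of [start, end) intervals, kept sorted
    for offset, length in sorted(ranges):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


# Three appends to the same file, followed by an unrelated store.
print(coalesce_store_ranges([(0, 4096), (4096, 4096), (8192, 100), (65536, 10)]))
```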

3.4.6 AFSCB_Event_StoreData

The notification is sent in response to a successful StoreData
RPC on FID. A structure of type AFSCB_Data_StoreData is sent with
the message.

struct AFSCB_Data_StoreData {
    afs_uint64 StoreOffset;
    afs_uint64 StoreLength;
    afs_uint64 Length;
    AFSCBFileStatus FileStatus;
};

StoreLength bytes were stored starting at position StoreOffset in
FID. Length is the current file length and FileStatus contains
the modification time of FID following the operation. The client
must regard cached file data in the range [StoreOffset,
StoreOffset+StoreLength) as invalidated, and may regard data
outside that range as up-to-date. The client MUST discard
undirtied cached data in the invalidated range. The client MAY
send dirtied data in the invalidated range to the file server
prior to discarding (as allowed in current AFS semantics).
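
A chunk-based cache might apply the notification as sketched below. The cache layout (a dict from zero-based chunk index to "clean" or "dirty"), the 64 KiB chunk size, and the function name are assumptions. Because invalidation is rounded out to whole chunks, the sketch may discard slightly more than the rule requires, which is conservative.

```python
CHUNK_SIZE = 64 * 1024  # assumed cache chunk size; not set by the draft


def apply_store_data(chunks, store_offset, store_length):
    """Apply a StoreData notification to a per-file chunk map.

    Clean chunks overlapping [store_offset, store_offset + store_length)
    are discarded. Dirty chunks are left in place and reported, so the
    caller can decide whether to store them before discarding.
    Returns (discarded_indexes, dirty_overlapping_indexes).
    """
    if store_length == 0:
        return [], []
    first = store_offset // CHUNK_SIZE
    last = (store_offset + store_length - 1) // CHUNK_SIZE
    discarded, dirty_overlap = [], []
    for index in sorted(chunks):
        if first <= index <= last:
            if chunks[index] == "dirty":
                dirty_overlap.append(index)
            else:
                del chunks[index]
                discarded.append(index)
    return discarded, dirty_overlap


cache = {i: "clean" for i in range(1, 16)}
print(apply_store_data(cache, 45 * CHUNK_SIZE, 4096), len(cache))
```

In the scenario of section 2.2, a store to chunk 45 leaves all fifteen cached chunks intact.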

3.4.7 AFSCB_Event_StoreACL

ACL and/or access information cached by the client for FID, if
any, is invalidated.

3.4.8 AFSCB_Event_StoreStatus

A StoreStatus RPC was successfully executed on FID. A structure
of type AFSCB_Data_StoreStatus is sent with the message.

struct AFSCB_Data_StoreStatus {
    struct AFSStoreStatus Status;
};

Status is the new AFSStoreStatus of FID.

3.4.9 AFSCB_Event_CreateFile

A file has been created in the vnode corresponding to FID. A
structure of type AFSCB_Data_CreateFile is sent with the message.

struct AFSCB_Data_CreateFile {
    string Name<AFSNAMEMAX>;
    AFSFid Fid;
    AFSFetchStatus FidStatus;
    AFSCBDirStatus DirStatus;
};

Name and Fid are, respectively, the name and FID of the created
file. FidStatus is the AFSFetchStatus of the created file, and
DirStatus the current modification time and link count of FID, at
the completion of the call.
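
For directory events, a client can patch its cached listing instead of refetching the directory. The sketch below assumes a hypothetical cache record (a dict with data_version, entries, link_count and mod_time keys) and applies the DataVersion rule of section 3.3.1 before touching it.

```python
def apply_create_file(cached_dir, message_dv, name, fid, link_count, mod_time):
    """Patch a cached directory with an AFSCB_Event_CreateFile notification.

    Returns True if the cache was updated in place, False if the
    DataVersion rule requires treating the message as a traditional
    callback (the caller should then drop the cached directory).
    """
    if cached_dir["data_version"] not in (message_dv, message_dv - 1):
        return False
    cached_dir["entries"][name] = fid        # Name and Fid of the new file
    cached_dir["link_count"] = link_count    # DirStatus.LinkCount
    cached_dir["mod_time"] = mod_time        # DirStatus.ClientModTime
    cached_dir["data_version"] = message_dv
    return True


directory = {"data_version": 7, "entries": {}, "link_count": 2, "mod_time": 100}
print(apply_create_file(directory, 8, "notes.txt", (536870913, 12, 34), 2, 160))
print(directory["entries"])
```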

3.4.10 AFSCB_Event_MakeDir

A directory has been created in the vnode corresponding to FID. A
structure of type AFSCB_Data_MakeDir is sent with the message.

struct AFSCB_Data_MakeDir {
    string Name<AFSNAMEMAX>;
    AFSFid Fid;
    AFSFetchStatus FidStatus;
    AFSCBDirStatus DirStatus;
};

Name and Fid are, respectively, the name and FID of the created
directory. FidStatus is the AFSFetchStatus of the created
directory, and DirStatus the current modification time and link
count of FID, at the completion of the call.

3.4.11 AFSCB_Event_Symlink

A symbolic link has been created in the vnode corresponding to
FID. A structure of type AFSCB_Data_Symlink is sent with the
message.

struct AFSCB_Data_Symlink {
    string Name<AFSNAMEMAX>;
    string LinkContents<AFSPATHMAX>;
    AFSFetchStatus FidStatus;
    AFSCBDirStatus DirStatus;
};

Name is the name of the symbolic link. The link points to
LinkContents. FidStatus is the AFSFetchStatus of the created
symbolic link, and DirStatus the current modification time and
link count of FID, at the completion of the call.

3.4.12 AFSCB_Event_Link

A hard link has been created in the vnode corresponding to FID. A
structure of type AFSCB_Data_Link is sent with the message.

struct AFSCB_Data_Link {
    string Name<AFSNAMEMAX>;
    AFSFid LinkTarget;
    AFSFetchStatus FidStatus;
    AFSCBDirStatus DirStatus;
};

Name is the name of the hard link. The link is a synonym for
LinkTarget. FidStatus is the AFSFetchStatus of the created
hard link, and DirStatus the current modification time and
link count of FID, at the completion of the call.

3.4.13 AFSCB_Event_RemoveFile

A file has been removed from the vnode corresponding to FID. A
structure of type AFSCB_Data_RemoveFile is sent with the message.

struct AFSCB_Data_RemoveFile {
    string Name<AFSNAMEMAX>;
    AFSCBDirStatus DirStatus;
};

Name indicates the removed entry. DirStatus is the current
modification time and link count of FID, at the completion of the
call.

3.4.14 AFSCB_Event_RemoveDir

A directory has been removed from the vnode corresponding to FID.
A structure of type AFSCB_Data_RemoveDir is sent with the
message.

struct AFSCB_Data_RemoveDir {
    string Name<AFSNAMEMAX>;
    AFSCBDirStatus DirStatus;
};

Name indicates the removed entry. DirStatus is the current
modification time and link count of FID, at the completion of the
call.

3.4.15 AFSCB_Event_Rename

A file or directory has been renamed, ie moved, from or to the
vnode corresponding to FID. A structure of type
AFSCB_Data_Rename is sent with the message.

const AFSCB_Rename_From = 1;
const AFSCB_Rename_To = 2;

struct AFSCB_Data_Rename {
    afs_uint32 Direction;
    string Name<AFSNAMEMAX>;
    AFSCBDirStatus FromStatus;
    AFSCBDirStatus ToStatus;
};

Direction indicates whether FID is the source or the destination
directory of the move. Name indicates the renamed (moved) entry.
FromStatus is the current modification time and link count of the
source directory vnode, and ToStatus is the current modification
time and link count of the destination directory vnode, at the
completion of the call.

To preserve atomicity, the AFSCB_Data_Rename message is
constructed so that changes to cached copies of both the source
and destination directory vnodes may be recovered from a single
notification.
If a client owns callbacks for both the source and destination
FIDs, a file server MAY elect to send only one notification, for
either the source or the destination FID.

3.4.16 AFSCB_Event_Deleted

The object corresponding to FID no longer exists, and so may no
longer be cached. It is an ex-object.

3.4.17 AFSCB_Event_ReleaseLock

A traditional AFS whole-file lock has been released on FID. A
structure of type AFSCB_Data_Lock is sent with the message.

struct AFSCB_Data_Lock {
    afs_uint32 LockType;
};

LockType is the type of the lock released.

Receipt of an AFSCB_Event_ReleaseLock notification in no way
implies an intention on the part of a file server to grant a lock
on FID to client. Non-receipt of a notification of this type in
no way implies non-release of locks that may be held on FID. The
file server SHOULD send notifications of this type only to
clients which have indicated probable interest in the event, eg,
by having recently requested a lock on FID.

4 Appendix A: XDR Grammar

#include "common.xg" /* Common structures & definitions */

%#ifdef KERNEL
%#include "../afs/longc_procs.h"
%#endif

package RXAFSCB_
prefix S
statindex 6

/* callback event types, predominantly events on the vnode for
 * which the callback is being made, but also (eg, Deleted) side
 * effects of operations on related vnodes */
const AFSCB_Event_Cancel = 1;       /* explicit cancel--callback promise
                                     * is broken, don't bother fetching
                                     * new status */
const AFSCB_Event_StoreData = 2;    /* data in file changed */
const AFSCB_Event_StoreACL = 3;     /* ACL changed on vnode */
const AFSCB_Event_StoreStatus = 4;  /* status stored on vnode */
const AFSCB_Event_CreateFile = 5;   /* file created in directory vnode */
const AFSCB_Event_MakeDir = 6;      /* dir created in directory vnode */
const AFSCB_Event_Symlink = 7;      /* symlink created in directory vnode */
const AFSCB_Event_Link = 8;         /* hard link created in directory vnode */
const AFSCB_Event_RemoveFile = 9;   /* file removed from directory vnode */
const AFSCB_Event_RemoveDir = 10;   /* dir removed from directory vnode */
const AFSCB_Event_Rename = 11;      /* object renamed (moved) */
const AFSCB_Event_Deleted = 12;     /* object no longer exists, ex object */
const AFSCB_Event_ReleaseLock = 13; /* traditional AFS lock released */

/* flags intended for use in AFSExtendedCallback Flags */
const AFSCB_Flag_Cancel = 1; /* Callback promise is cancelled */

/* flags intended for use in AFSExtendedCallback ExtraFlags,
 * when (flags & AFSCB_Flag_Cancel), to indicate reason for
 * cancellation */
const AFSCB_Cancel_Shutdown = 1;
const AFSCB_Cancel_CallbackGC = 2;
const AFSCB_Cancel_VolumeOffline = 4;
const AFSCB_Cancel_VolumeMoved = 8;



/* identical with decl in afsint.xg--this should move to common.xg */
struct AFSStoreStatus {
    afs_uint32 Mask;
    afs_uint32 ClientModTime;
    afs_uint32 Owner;
    afs_uint32 Group;
    afs_uint32 UnixModeBits;
    afs_uint32 SegSize;
};

/* differential status to be sent with StoreData msgs */
struct AFSCBFileStatus {
    afs_uint64 ClientModTime;
};

/* differential status to be sent with directory change msgs */
struct AFSCBDirStatus {
    afs_uint32 LinkCount;
    afs_uint64 ClientModTime;
};



struct AFSCB_Data_StoreData {

~    afs_uint64 StoreOffset;

~    afs_uint64 StoreLength;

~    afs_uint64 Length;

~    AFSCBFileStatus FileStatus;

};



struct AFSCB_Data_StoreStatus {

~    struct AFSStoreStatus Status;

};



struct AFSCB_Data_CreateFile {

~    string Name<AFSNAMEMAX>;

~    AFSFid Fid;

~    AFSFetchStatus FidStatus;

~    AFSCBDirStatus DirStatus;

};



struct AFSCB_Data_MakeDir {

~    string Name<AFSNAMEMAX>;

~    AFSFid Fid;

~    AFSFetchStatus FidStatus;

~    AFSCBDirStatus DirStatus;

};



struct AFSCB_Data_Symlink {
    string Name<AFSNAMEMAX>;
    string LinkContents<AFSPATHMAX>;
    AFSFetchStatus FidStatus;
    AFSCBDirStatus DirStatus;
};



struct AFSCB_Data_Link {
    string Name<AFSNAMEMAX>;
    AFSFid LinkTarget;
    AFSFetchStatus FidStatus;
    AFSCBDirStatus DirStatus;
};



struct AFSCB_Data_RemoveFile {
    string Name<AFSNAMEMAX>;
    AFSCBDirStatus DirStatus;
};



struct AFSCB_Data_RemoveDir {
    string Name<AFSNAMEMAX>;
    AFSCBDirStatus DirStatus;
};



const AFSCB_Rename_From = 1;
const AFSCB_Rename_To = 2;



struct AFSCB_Data_Rename {
    afs_uint32 Direction;
    string Name<AFSNAMEMAX>;
    AFSCBDirStatus FromStatus;
    AFSCBDirStatus ToStatus;
};



struct AFSCB_Data_Lock {
    afs_uint32 LockType;
};



union AFSCB_NotificationData switch (afs_uint32 Event_Type) {
case AFSCB_Event_StoreData:
    AFSCB_Data_StoreData u_store_data;
case AFSCB_Event_StoreACL:
    void;
case AFSCB_Event_StoreStatus:
    AFSCB_Data_StoreStatus u_store_status;
case AFSCB_Event_CreateFile:
    AFSCB_Data_CreateFile u_create_file;
case AFSCB_Event_MakeDir:
    AFSCB_Data_MakeDir u_make_dir;
case AFSCB_Event_Symlink:
    AFSCB_Data_Symlink u_symlink;
case AFSCB_Event_Link:
    AFSCB_Data_Link u_link;
case AFSCB_Event_RemoveFile:
    AFSCB_Data_RemoveFile u_remove_file;
case AFSCB_Event_RemoveDir:
    AFSCB_Data_RemoveDir u_remove_dir;
case AFSCB_Event_Rename:
    AFSCB_Data_Rename u_rename;
case AFSCB_Event_Deleted:
    void;
case AFSCB_Event_ReleaseLock:
    AFSCB_Data_Lock u_release_lock;
case AFSCB_Event_Cancel:
    void;
};



/* extended callback structure */
struct AFSExtendedCallBack {
    afs_uint32 Flags;
    afs_uint32 ExtraFlags;
    afs_uint32 DataVersion;
    afs_uint32 ExpirationTime;
    AFSCB_NotificationData Data;
};



/* Forward-looking union for callback results */
union AFSCB_ResultData switch (afs_uint32 Result_Type) {
case AFSCB_Result_NoResult:
    void;
};



typedef AFSExtendedCallBack AFSExtendedCallBackSeq<AFSCBMAX>;



/* extended callback result structure */
struct AFSExtendedCallBackResult {
    afs_uint32 Flags;
    afs_uint32 ExtraFlags;
    AFSCB_ResultData Data;
};



typedef AFSExtendedCallBackResult AFSExtendedCallBackRSeq<AFSCBMAX>;

/* this prototype follows the style of the traditional AFS
 * CallBack proc, but is intended to support coalescing of
 * callbacks */



proc ExtendedCallBack(
    IN  HostIdentifier *Server,
    IN  AFSCBFids *Fids_Array,
    IN  AFSExtendedCallBackSeq *CallBacks_Array,
    OUT AFSExtendedCallBackRSeq *CallBack_Result_Array
) multi = 65540;
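
The following non-normative sketch, written in Python rather than the
RPC-L of the listing above, illustrates how a cache manager might
interpret a single AFSExtendedCallBack: test AFSCB_Flag_Cancel first,
decode the cancellation reason from ExtraFlags, and otherwise dispatch
on the event type. The numeric constants are copied from the listing.
Everything else is an assumption made for illustration and is not part
of the proposed interface: the function names, the dict used to stand
in for the decoded structure, the returned action labels, the
treatment of ExtraFlags as a combinable bit mask, and the fallback to
a traditional refetch for unrecognized events.

```python
# Constants copied from the interface listing above.
AFSCB_FLAG_CANCEL = 1

AFSCB_CANCEL_REASONS = {
    1: "Shutdown",
    2: "CallbackGC",
    4: "VolumeOffline",
    8: "VolumeMoved",
}

AFSCB_EVENT_REMOVE_FILE = 9
AFSCB_EVENT_REMOVE_DIR = 10
AFSCB_EVENT_RENAME = 11
AFSCB_EVENT_DELETED = 12
AFSCB_EVENT_RELEASE_LOCK = 13

AFSCB_RENAME_FROM = 1
AFSCB_RENAME_TO = 2


def cancel_reasons(extra_flags):
    """Return the names of the cancellation bits set in ExtraFlags.

    Assumption: the reason values (1, 2, 4, 8) may be OR-ed together.
    """
    return [name for bit, name in sorted(AFSCB_CANCEL_REASONS.items())
            if extra_flags & bit]


def handle_callback(cb):
    """Map one extended callback to a hypothetical client action.

    `cb` is a hypothetical dict standing in for a decoded
    AFSExtendedCallBack, with keys "Flags", "ExtraFlags",
    "Event_Type" and (optionally) "Data".
    """
    if cb["Flags"] & AFSCB_FLAG_CANCEL:
        # The callback promise is gone; ExtraFlags carries the reason.
        return ("discard-promise", cancel_reasons(cb["ExtraFlags"]))

    event = cb["Event_Type"]
    data = cb.get("Data", {})

    if event in (AFSCB_EVENT_REMOVE_FILE, AFSCB_EVENT_REMOVE_DIR):
        return ("remove-entry", data["Name"])
    if event == AFSCB_EVENT_RENAME:
        if data["Direction"] == AFSCB_RENAME_FROM:
            return ("renamed-from", data["Name"])
        if data["Direction"] == AFSCB_RENAME_TO:
            return ("renamed-to", data["Name"])
        return ("refetch", None)
    if event == AFSCB_EVENT_DELETED:
        return ("drop-object", None)
    if event == AFSCB_EVENT_RELEASE_LOCK:
        return ("lock-released", data["LockType"])

    # Unrecognized event: behave as for a traditional callback.
    return ("refetch", None)
```

For instance, under these assumptions a callback with Flags = 1 and
ExtraFlags = 12 is reported as a cancelled promise with the reasons
VolumeOffline and VolumeMoved, and the event data is not consulted.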

References

[1] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] Howard, J.H., Kazar, M.L., Menees, S.G., Nichols, D.A.,
Satyanarayanan, M., Sidebotham, R.N., and West, M., "Scale and
Performance in a Distributed File System", ACM Transactions on
Computer Systems, February 1988.

[3] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame,
C., Eisler, M., and D. Noveck, "Network File System (NFS) version
4 Protocol", RFC 3530, April 2003.

[4] Zayas, E.R., "AFS-3 Programmer's Reference: File
Server/Cache Manager Interface", Transarc Corporation,
FS-00-D162, 20 August 1991.

[5] Leach, P.J. and Naik, D.C., "A Common Internet File System
(CIFS/1.0) Protocol",
http://www.tools.ietf.org/html/draft-leach-cifs-v1-spec-01,
1997.

[6] Kazar, M.L., "Synchronization and Caching Issues in
the Andrew File System", USENIX Conference Proceedings, USENIX
Association, Berkeley, CA, Dallas Winter 1988, pages 27-36.

[7] Mummert, L.B. and Satyanarayanan, M., "Large Granularity
Cache Coherence for Intermittent Connectivity", USENIX Summer
1994, pages 279-289.




- --

Matt Benjamin

The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI  48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIQzlIJiSUUSaRdSURCLMPAJ4/oFZwtr7E5l6cB+lpHdi+qrS0UACfVq76
3FwiZjzOdSYq7Mx+kbXgC6I=
=n1wl
-----END PGP SIGNATURE-----