[AFS3-std] byte-range locking and delegation

Matt Benjamin matt@linuxbox.com
Mon, 27 Oct 2008 20:37:56 -0400


Attached please find a draft proposing interfaces for byte-range
locking and delegation, with supporting semantic discussion.

The draft has seen 2 rounds of early review, and concepts in the
document were discussed at the OpenAFS Hackathon at Ohiolinux.  Thanks
to all who have assisted.

Please assist further :)

Thanks,

Matt

--

Matt Benjamin

The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI  48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309


Attachment: locking_delegation_d7.txt

AFS Byte-Range Locking and Delegation

Matt Benjamin <matt@linuxbox.com>

10/26/2008

Status of this Memo

This document specifies a standards track protocol extension for
the OpenAFS community, and requests discussion and suggestions
for improvements.

Key Words

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
Internet Engineering Task Force RFC 2119.

Abstract

The AFS-3 protocol supports file locks, but only on whole files,
only in advisory mode, and using an inefficient protocol.
Efficient support for byte-range file locks, together with the
stronger semantics with which they are associated, is required
to improve the suitability of AFS as a LAN file-sharing protocol
for both Unix and Windows clients. Applications on the Windows
platform in particular (e.g., Microsoft Office) require
byte-range locking to function correctly. Emulation in
the client has alleviated the most serious problems, albeit with
reduced semantics.

We propose protocol enhancements facilitating server-coordinated
byte-range locks, atomic lock up/down-grade support, improved
semantics for files under byte-range lock control, protocol
support for wait-on-lock with fairness, and mandatory lock
enforcement for clients on request. A conditional strengthened
callback semantics (``delegation''), governing file data and
locks, is proposed to reduce network and file-server workload for
uncontested file lock operations.

Table of Contents

Status of this Memo
Key Words
Abstract
    1 AFS-3 File Locking
        1.1 Analysis
    2 Byte-Range Locking Interfaces
        2.1 Dependencies
        2.2 Backward Compatibility
        2.3 Concepts
            2.3.1 General
            2.3.2 Lock Management
            2.3.3 Deferred Locks
        2.4 Constants
            2.4.1 Lock Flags
                AFSLock_Flag_Mand
                AFSLock_Flag_Wait
            2.4.2 Lock Status
                AFSLock_Flag_Extend_Ok
                AFSLock_Flag_Undelegate_Ok
            2.4.3 Callback Constants
            2.4.4 Callback Result Constants
                AFSCB_Cancel_ExtendLocks
                AFSCB_Cancel_RevokeLocks
                AFSCB_Flag_ExtendLocks
                AFSCB_Flag_RevokeLocks
        2.5 Data Types
            2.5.1 AFSByteRangeLock
                Fid
                Type
                Owner
                Uniq
                Offset
                Length
                ExpirationTime
            2.5.2 AFSByteRangeLockSeq
            2.5.3 AFSLockFlagsSeq
            2.5.4 HostIdentifierSeq
            2.5.5 AFSCB_ResultData Redefinition
                AFSCB_Result_ReturnLocks
                AFSCB_Result_ResponseDeferred
        2.6 Procedures
            2.6.1 SetByteRangeLock
                Notes
            Error Codes
                EACCES
                EWOULDBLOCK
                EDEADLK
                EINVAL
                ENOLCK
            2.6.2 ReleaseByteRangeLock
            Notes
            Error Codes
                EINVAL
            2.6.3 UpgradeByteRangeLock
            Error Codes
                EINVAL
                EWOULDBLOCK
                EDEADLK
            2.6.4 DowngradeByteRangeLock
            Notes
            Error Codes
                EINVAL
            2.6.5 GetByteRangeLockStatus
            Error Codes
                EACCES
            2.6.6 CancelByteRangeLock
            2.6.7 AssertExtendLocks
        2.7 Windows & Unix Lock Semantics
            2.7.1 Byte-Range Locking
            2.7.2 Read/Write vs. Shared/Exclusive
            2.7.3 Atomic Lock Open
        2.8 Mandatory Enforcement
            2.8.1 Governing Ideas
            2.8.2 Enforcement Rules
    3 Delegation
        3.1 Dependencies
        3.2 Backward Compatibility
        3.3 Lock Delegation
        3.4 File Delegation
            3.4.1 Semantic Changes
            3.4.2 Delegation
            3.4.3 Revocation
        3.5 Constants
            3.5.1 Delegation Types
                AFS_DType_General
            3.5.2 Callback Constants
                AFSCB_Flag_Delegation
                AFSCB_Cancel_RevokeDelegation
                AFSCB_Flag_RevokeDelegation
                AFSCB_Flag_ExtremePrejudice
        3.6 DataTypes
            3.6.1 AFSDelegation
                Fid
                Type
                Flags
                Offset
                Length
                ExpirationTime
            3.6.2 AFSExtendedCallBack
        3.7 Procedures
            3.7.1 RequestDelegation
                Fid
                Type
                Flags
                Offset
                Length
                Delegation
            Error Codes
                EACCES
                EWOULDBLOCK
                EINVAL
            3.7.2 UndelegateReturningLocks
    4 Appendix A: XDR Grammar (afsint.xg)
    5 Appendix A: XDR Grammar (afscbint.xg)


1 AFS-3 File Locking

While AFS-3 does support file locking, it permits locking of
whole files only, and provides this support inefficiently. AFS
clients can take locks on any file object, with the granularity
of an entire file, using the RXAFS_SetLock procedure, and release
them with the RXAFS_ReleaseLock procedure. AFS uses a poll-based
locking model. AFS file locks, once issued, are considered to
persist for only 5 minutes, unless extended by the requesting
client using the RXAFS_ExtendLock procedure. This simplifies the
AFS file server, but complicates clients and wastes network
capacity. The OpenAFS file server implementation, based on the
original Transarc AFS file server, tracks locks directly in its
on-disk volume structures. Considering the 5-minute duration
asserted for file locks, the reason for this decision is clearly
not to support lock persistence for long periods, although it may
have been intended to allow locks to persist through server
restarts (or crashes). The disk package tracks lock type
(LockRead or LockWrite), the number of clients holding locks, and
a timestamp. Lock ownership, while it may in many cases be
reliably inferred, is not recorded. Hence, a broken or malicious
client might release locks it never set (i.e., locks set by other
clients). The AFS protocol also does not permit atomic lock
upgrades (or downgrades).

1.1 Analysis

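The AFS locking protocol is unfair, and wasteful of client and
network resources. Because a client that finds a file locked can
only retry, the protocol gives no ordering among waiters, and
both the retries and the periodic RXAFS_ExtendLock calls consume
client and network resources. This proposal addresses both the
fairness and the efficiency problems.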

2 Byte-Range Locking Interfaces

2.1 Dependencies

The byte-range lock feature depends on support for extended
callback notifications and extended host tracking support in
client and server.

2.2 Backward Compatibility

AFS clients and servers will indicate their support for
byte-range locking through new client and file server capability
flags:

const CLIENT_CAPABILITY_BYTE_RANGE_LOCK = 0x0008;

const VICED_CAPABILITY_BYTE_RANGE_LOCK = 0x0010;
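As a minimal sketch of how the two capability bits might gate use
of the new RPCs: the flag values are those given above, while the
helper name and the idea of passing each peer's capability word as
a plain integer are assumptions made for illustration only.

```python
# Flag values are taken from the draft; everything else is hypothetical.
CLIENT_CAPABILITY_BYTE_RANGE_LOCK = 0x0008
VICED_CAPABILITY_BYTE_RANGE_LOCK = 0x0010


def can_use_byte_range_locks(client_caps: int, server_caps: int) -> bool:
    """Return True only when both peers advertise byte-range lock support."""
    client_ok = bool(client_caps & CLIENT_CAPABILITY_BYTE_RANGE_LOCK)
    server_ok = bool(server_caps & VICED_CAPABILITY_BYTE_RANGE_LOCK)
    return client_ok and server_ok
```

A peer lacking the bit would presumably continue to use the
traditional whole-file RXAFS_SetLock interface.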

2.3 Concepts

2.3.1 General

An AFS file server is responsible for coordinating byte-range
locking requests and, optionally, enforcing mandatory locking
semantics relative to file operations initiated at different
clients. By contrast with the traditional AFS file locking
protocol, the proposed byte-range locking protocol attempts to
associate locks with a unique subject, specifically, a
ViceID and unique identifier which could correspond to a unique
session or process executing on the client machine.

Clients (cache-manager processes not co-located in memory)
request and release byte-range locks through a pair of interfaces
(SetByteRangeLock, ReleaseByteRangeLock) similar to those
provided by the traditional AFS locking implementation. The same
lock types (read and write, in general regarded as ``shared'' and
``exclusive'') are defined as in traditional AFS locking.
Additional arguments and flags are provided to permit selection
of desired lock ranges, intention to ``wait'' on the lock (i.e.,
willingness to accept a deferred issue of the lock at such time
as the file server can grant it, if it cannot be granted
immediately), and desired special semantics--currently, the
client may request mandatory enforcement. Clients already holding
a read or write lock on a range may atomically upgrade or
downgrade the lock to the other type, i.e., they need not
release a lock of one type before requesting the other,
avoiding the race condition present in the traditional AFS
locking protocol.

Byte-range locks are permanently associated with an owner, the
client which requested the lock. A lock may not be released by a
client which never owned it. Administrative users may under
various circumstances need to identify the owner and state of
locks on a locked file, and to revoke file locks
administratively. This proposal includes RPCs allowing
administrative users to perform these operations, and suggests
exposure through new AFS pioctls and the fs command.

2.3.2 Lock Management

Lock management in the proposed interface is completely redefined
relative to the file locking in AFS-3. Concepts are borrowed from
AFS cache management, including the callback concept. A
byte-range lock may be regarded as a special-purpose callback. A
file server may use the ExtendedCallBack interface to request
re-assertion of existing locks, revoke file delegations (which
may include client-issued byte-range locks), or cancel locks
completely.

2.3.3 Deferred Locks

Where possible, locks are granted immediately with the completion
of the SetByteRangeLock request. A file server MAY, on explicit
request and subject to client capability, agree to prospectively
issue a lock to an interested client at a future time, when the
requested lock becomes available. Such deferred locks constitute
a promise to issue the lock with best-effort consideration of
fairness. A new procedure in the client RPC interface
(AsyncIssueByteRangeLock) is provided to effect asynchronous
issue of a deferred lock to a waiting client. Deferred locks may
themselves be canceled.
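The draft leaves the fairness policy open. As one possible
reading, the sketch below keeps deferred requests in arrival
order and, when asked, hands out the oldest request that no
longer overlaps a held range. The class and method names are
hypothetical, and every lock is treated as exclusive to keep the
example short.

```python
from collections import deque


def overlaps(a_off, a_len, b_off, b_len):
    """Half-open ranges [off, off+len) overlap."""
    return a_off < b_off + b_len and b_off < a_off + a_len


class DeferredQueue:
    """Hypothetical server-side queue of deferred lock promises."""

    def __init__(self):
        self.waiters = deque()  # (client, offset, length), oldest first

    def defer(self, client, offset, length):
        self.waiters.append((client, offset, length))

    def cancel(self, client):
        """A deferred lock may itself be canceled."""
        self.waiters = deque(w for w in self.waiters if w[0] != client)

    def next_grantable(self, held):
        """Pop the oldest waiter that overlaps no (offset, length) in held."""
        for waiter in list(self.waiters):
            _, off, length = waiter
            if not any(overlaps(off, length, h_off, h_len)
                       for h_off, h_len in held):
                self.waiters.remove(waiter)
                return waiter
        return None
```

A server following this policy would call next_grantable after
each release and then issue the lock to the returned client.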

2.4 Constants

2.4.1 Lock Flags

The following flag constants are defined for use in the Flags
member of the AFSByteRangeLock structure and equivalently in the
Flags argument of the SetByteRangeLock procedure, with the same
semantics:

const AFSLock_Flag_Mand = 1; /* req. enforcement */

const AFSLock_Flag_Wait = 2; /* req. async wait on lock */

  AFSLock_Flag_Mand

Requests mandatory enforcement when sent with a SetByteRangeLock
request or in a deferred AFSByteRangeLock instance. Asserts
mandatory enforcement in an AFSByteRangeLock instance.

  AFSLock_Flag_Wait

When sent with a SetByteRangeLock request, requests a deferred
lock if an immediate lock cannot be granted. Indicates a deferred
lock in an AFSByteRangeLock instance. The SetByteRangeLock
procedure may return locks in this state, subject to client
capability and if so requested in the Flags argument.

2.4.2 Lock Status

The following flag constants are provided to coordinate advanced
lock-management operations:

const AFSLock_Flag_Extend_Ok = 4; /* extended */

const AFSLock_Flag_Undelegate_Ok = 8; /* undelegated, asserted */

  AFSLock_Flag_Extend_Ok

Returned from AssertExtendLocks in the OutResult array; indicates
lock confirmation.

  AFSLock_Flag_Undelegate_Ok

Returned from UndelegateReturningLocks in the OutStatus array;
indicates server agreement to assert the undelegated lock.

2.4.3 Callback Constants

The following callback cancellation types and flags are provided,
to facilitate lock management through the ExtendedCallback
interface:

const AFSCB_Cancel_ExtendLocks = 7; /* re-assert locks, or lose them */

const AFSCB_Cancel_RevokeLocks = 8; /* locks on Fid revoked */

2.4.4 Callback Result Constants

The following constant is provided as a discriminator for the
AFSCB_ResultData member of AFSCBExtendedCallbackResult, allowing
clients to indicate their intention to defer returning locks or
delegations to a subsequent RPC on the file server:

const AFSCB_Result_ResponseDeferred = 2;

The following constant is provided as a discriminator for the
AFSCB_ResultData member of AFSCBExtendedCallbackResult, allowing
clients to indicate their intention to return locks in the
CallBack_Result_Array OUT parameter:

const AFSCB_Result_ReturnLocks = 3;

  AFSCB_Cancel_ExtendLocks

When sent as the reason for cancellation in an ExtendedCallback
notification, indicates the server requires re-assertion of all
locks on FID using the file server's AssertExtendLocks procedure.
The client MUST execute the procedure for all locks it asserts on
FID prior to the ExpirationTime in the callback, else it MUST
consider any locks it held on FID to be canceled.

  AFSCB_Cancel_RevokeLocks

When sent as the reason for cancellation in an ExtendedCallback
notification, indicates administrative cancellation of all locks
on FID.

const AFSCB_Flag_AssertLocks = 4; /* request ExtendLock */

const AFSCB_Flag_RevokeLocks = 8; /* locks cancelled, sorry */

  AFSCB_Flag_ExtendLocks

Has the same meaning and effect as AFSCB_Cancel_ExtendLocks, but
may be sent with an arbitrary extended callback message.

  AFSCB_Flag_RevokeLocks

Has the same meaning and effect as AFSCB_Cancel_RevokeLocks, but
may be sent with an arbitrary extended callback message.

2.5 Data Types

2.5.1 AFSByteRangeLock

The AFSByteRangeLock data type represents a byte-range lock
issued by an AFS file server:

struct AFSByteRangeLock {

    AFSFid Fid;

    afs_uint32 Type;

    afs_uint32 Owner;

    afs_uint32 Uniq;

    afs_uint32 Flags;

    afs_uint64 Offset;

    afs_uint64 Length;

    afs_uint64 ExpirationTime;

};

  Fid

The Fid on which the lock is held.

  Type

The type of lock requested, LockRead or LockWrite. A byte-range
read lock is a non-exclusive read assertion on the stated range,
which may be shared by any number of readers and no writers. A
byte-range write lock is an exclusive write assertion on the
stated range.

  Owner

The ViceID in use by the client requesting the lock.

  Uniq

Value uniquely identifying a session or process context at the
client.

  Offset

The distance in bytes from beginning-of-file to the start of the
locked range.

  Length

Length in bytes of the locked range.

  ExpirationTime

AFSByteRangeLock instances may be regarded as a special-purpose
callback. Instances persist until canceled, or until
ExpirationTime is reached.
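The read/write semantics described under Type can be sketched as
a conflict test between two lock records. This is an
illustration under stated assumptions, not the draft's data
layout: the Python class mirrors only some AFSByteRangeLock
members, the Type values are symbolic stand-ins, and treating two
locks with the same (Owner, Uniq) pair as never conflicting is an
assumption.

```python
from dataclasses import dataclass

# Symbolic stand-ins for the LockRead / LockWrite Type values.
LOCK_READ, LOCK_WRITE = "LockRead", "LockWrite"


@dataclass(frozen=True)
class ByteRangeLock:
    owner: int      # ViceID of the requester
    uniq: int       # session or process identifier at the client
    lock_type: str  # LOCK_READ or LOCK_WRITE
    offset: int     # bytes from beginning-of-file
    length: int     # bytes in the locked range

    def overlaps(self, other: "ByteRangeLock") -> bool:
        """Ranges are half-open: [offset, offset + length)."""
        return (self.offset < other.offset + other.length
                and other.offset < self.offset + self.length)


def conflicts(a: ByteRangeLock, b: ByteRangeLock) -> bool:
    """Readers share a range; a writer excludes every other holder."""
    if (a.owner, a.uniq) == (b.owner, b.uniq):
        return False  # assumption: a holder never conflicts with itself
    if not a.overlaps(b):
        return False
    return LOCK_WRITE in (a.lock_type, b.lock_type)
```

For example, two read locks on overlapping ranges coexist, while
a write lock covering even a single shared byte conflicts with
both.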

2.5.2 AFSByteRangeLockSeq

A variable-length array of type AFSByteRangeLock used for bulk
calls for asserting and returning locks recalled from delegation.

const AFS_LOCK_SEQ_MAX = 10000;

typedef AFSByteRangeLock AFSByteRangeLockSeq <AFS_LOCK_SEQ_MAX>;

2.5.3 AFSLockFlagsSeq

An array of flags used in parallel with AFSByteRangeLockSeq,
above.

const AFS_LOCK_SEQ_MAX = 10000;

typedef afs_int32 AFSLockFlagsSeq <AFS_LOCK_SEQ_MAX>;

2.5.4 HostIdentifierSeq

const AFS_LOCK_SEQ_MAX = 10000;

typedef HostIdentifier AFSLockHostIdentifierSeq <AFS_LOCK_SEQ_MAX>;

An array of HostIdentifier structures used by the
GetByteRangeLockStatus procedure to report client machines
holding locks.

2.5.5 AFSCB_ResultData Redefinition

The AFSCB_ResultData union defined in the Callback Extended
Information draft is redefined (upward compatibly), as the
following:

union AFSCB_ResultData switch (afs_uint32 Result_Type) {

case AFSCB_Result_NoResult:

    void;

case AFSCB_Result_ResponseDeferred:

    void;

case AFSCB_Result_ReturnLocks:

    AFSByteRangeLockSeq AssertedLocks_Array;

};

  AFSCB_Result_ReturnLocks

The result is used to return (synchronously, in the
ExtendedCallBack RPC) a list of byte-range locks being extended
in response to an extended callback notification of type
AFSCB_Flag_AssertLocks, or asserted in response to one of type
AFSCB_Cancel_RevokeDelegation or sent with the flag
AFSCB_Flag_RevokeDelegation.

  AFSCB_Result_ResponseDeferred

The result is used to indicate that the client will not assert or
return locks synchronously in the ExtendedCallBack RPC (and will
instead assert or return locks using the asynchronous RPCs
provided).

2.6 Procedures

2.6.1 SetByteRangeLock

Requests a lock of type Type on Fid, on the range [Offset,
Offset+Length). Type must be one of LockRead or LockWrite. Owner
shall be set to the ViceID corresponding to the requesting
process or equivalent, or to 0 if this is not known. Uniq shall
be set to a value uniquely identifying the requesting process or
equivalent. On Unix-like systems, Uniq could be set to the PID of
the requesting process.

proc SetByteRangeLock(

    IN AFSFid *Fid,

        afs_uint32 Type,

        afs_uint32 Flags,

        afs_uint32 Owner,

        afs_uint32 Uniq,

        afs_uint64 Offset,

        afs_uint64 Length,

    OUT AFSByteRangeLock *Lock

) = 65601;

  Notes

On successful return the file server has granted the requested
lock, and Lock points to the server's asserted AFSByteRangeLock
structure. If the client has requested and the server agrees to
issue a deferred lock, Lock points to the server's asserted
deferred AFSByteRangeLock structure. The client may safely
determine if it has been granted a deferred lock by inspecting
the value of Lock->Flags.

The returned Lock structure MUST NOT differ from the request with
respect to range, except in the case where the requested lock
would overlap with a lock of the same type already held by the
same client, in which case, the locks are merged and the merged
range returned in Lock. The returned Lock structure MAY differ
from request with respect to Flags.
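The merge rule can be sketched as follows, assuming the server
already has the list of ranges of the same type held by the same
client. The function name and the (offset, length) tuple
representation are hypothetical, and ranges that merely touch
without overlapping are left unmerged, since the text speaks only
of overlap.

```python
def merge_request(held, offset, length):
    """Return the (offset, length) a server might report in Lock.

    held: (offset, length) ranges of the same lock type already
    held by the requesting client. Ranges are half-open.
    """
    start, end = offset, offset + length
    remaining = list(held)
    changed = True
    while changed:  # repeat, since a merge can reach further ranges
        changed = False
        for h_off, h_len in list(remaining):
            h_end = h_off + h_len
            if h_off < end and start < h_end:
                start, end = min(start, h_off), max(end, h_end)
                remaining.remove((h_off, h_len))
                changed = True
    return start, end - start
```

With held ranges [0, 10) and [20, 30), a request for [5, 15)
would be reported as [0, 15), and a request for [5, 25) as
[0, 30); a request overlapping nothing comes back unchanged.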

The value of the Flags argument may alter the semantics and/or
processing of the call:

• if (Flags & AFSLock_Flag_Mand), file server is requested to
  enforce mandatory locks on writes to, or truncation overlapping
  with, the locked range--if the file server is willing to provide
  mandatory enforcement, it MAY set the corresponding flag in
  Lock, and if so MUST restrict writes on the asserted range to
  the holding client for the duration of the lock

• if (Flags & AFSLock_Flag_Wait), file server is requested to
  issue a deferred lock if the requested lock may not be
  immediately granted--the file server MAY grant a deferred lock
  in response to this request, indicating its agreement by
  setting the corresponding flag in Lock. Lock is in this
  instance an indicator only of the deferred lock promise
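A client-side reading of the returned flags might look like the
sketch below. The flag values are those defined in section 2.4.1;
the function name and the shape of the returned dictionary are
hypothetical.

```python
AFSLock_Flag_Mand = 1  # values from section 2.4.1
AFSLock_Flag_Wait = 2


def describe_reply(requested_flags: int, returned_flags: int) -> dict:
    """Classify a SetByteRangeLock reply from the Flags in Lock."""
    deferred = bool(returned_flags & AFSLock_Flag_Wait)
    mandatory = bool(returned_flags & AFSLock_Flag_Mand)
    asked_mandatory = bool(requested_flags & AFSLock_Flag_Mand)
    return {
        "deferred": deferred,             # promise only, not yet held
        "held_now": not deferred,
        "mandatory_enforced": mandatory,
        "mandatory_declined": asked_mandatory and not mandatory,
    }
```

A client that asked for both flags and received only
AFSLock_Flag_Wait would thus conclude that it holds a deferred
promise without mandatory enforcement.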

  Error Codes

  EACCES

The caller does not have the necessary rights.

  EWOULDBLOCK

The server is unable to grant the request due to conflicting
locks. If a deferred lock was requested, a Flags value of
AFSLock_Flag_Wait indicates the deferred lock is granted.

  EDEADLK

The server declines to grant the requested lock (or deferred
lock) because granting it would cause a deadlock.

  EINVAL

An illegal lock type was specified.

  ENOLCK

The server has insufficient resources to grant the lock, or the
requesting client or file has too many locks outstanding. (No
specific limits are mandated or suggested by this document.)

2.6.2 ReleaseByteRangeLock

Releases the byte-range lock represented in Lock, asserted to be
held by the calling client.

proc ReleaseByteRangeLock(

    IN AFSByteRangeLock *Lock

) = 65602;

  Notes

When an AFS client intends to release a byte-range write lock, it
MUST ensure that any changed data in the affected range has been
sent to the file server with the appropriate StoreData RPC, and
that the RPC completed successfully. This requirement is based on
an implied assertion that holding a lock on some region of a file
implies, invariantly, an up-to-date view on the locked region.

  Error Codes

  EINVAL

The caller does not own the corresponding lock.

2.6.3 UpgradeByteRangeLock

Upgrades the byte-range lock represented in Lock, asserted to be
held by the calling client, from its current type (which should
be LockRead) to LockWrite. The upgrade is executed atomically (no
opportunity exists for another client to set a conflicting lock
in the upgraded range while the upgrade is being executed).

proc UpgradeByteRangeLock(

    IN AFSByteRangeLock *Lock,

    afs_uint32 Type

) = 65603;

  Error Codes

  EINVAL

The caller does not own the corresponding lock or it is not of=20
the correct type.

  EWOULDBLOCK

The lock could not be granted due to conflicting locks.

  EDEADLK

The lock could not be granted because granting it, with deferral,
would cause deadlock.
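The upgrade check and its error codes can be sketched against an
in-memory lock table. The table layout, the single opaque owner
key (standing in for the Owner and Uniq pair) and the function
name are hypothetical; a real server would hold its lock-table
mutex across the check and the type change, which is what makes
the upgrade atomic.

```python
import errno


def overlaps(a_off, a_len, b_off, b_len):
    """Half-open ranges [off, off+len) overlap."""
    return a_off < b_off + b_len and b_off < a_off + a_len


def upgrade(table, owner, offset, length):
    """Upgrade owner's read lock on [offset, offset+length) in place.

    table: list of dicts with keys owner, type, offset, length.
    Returns 0 on success, else an errno value.
    """
    mine = next((l for l in table
                 if l["owner"] == owner and l["offset"] == offset
                 and l["length"] == length), None)
    if mine is None or mine["type"] != "read":
        return errno.EINVAL          # not owned, or not a read lock
    for other in table:
        if other["owner"] != owner and overlaps(
                offset, length, other["offset"], other["length"]):
            return errno.EWOULDBLOCK  # another holder shares the range
    mine["type"] = "write"
    return 0
```

Deadlock detection (EDEADLK) applies only when the upgrade may be
deferred and is omitted here.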

2.6.4 DowngradeByteRangeLock

Downgrades the byte-range lock represented in Lock, asserted to
be held by the calling client, from its current type (which
should be LockWrite) to LockRead. The downgrade is executed
atomically (no opportunity exists for another client to set a
conflicting lock in the downgraded range while the downgrade is
being executed).

proc DowngradeByteRangeLock(

    IN AFSByteRangeLock *Lock,

    afs_uint32 Type

) = 65604;

  Notes

When an AFS client intends to downgrade a byte-range write lock,
it MUST ensure that any changed data in the affected range has
been sent to the file server with the appropriate StoreData RPC,
and that the RPC completed successfully. This requirement is
based on an implied assertion that holding a lock on some region
of a file implies, invariantly, an up-to-date view on the locked
region.

(Allowing the store obligation to be transferred to the release
of the read lock that should result from the
DowngradeByteRangeLock call is theoretically justified, but
weakens consistency, and does not seem to entail any strong
benefit to the client.)

  Error Codes

  EINVAL

The caller does not own the corresponding lock or it is not of=20
the correct type.

2.6.5 GetByteRangeLockStatus

Diagnostic procedure provided to permit system administrators to
identify client machines and software running on those clients
that are currently holding locks on a file. Fid is the file to
report on. The call returns parallel variable-length arrays of
locks and their associated hosts. The procedure may only be
executed by the AFS super user or members of the
system:administrators group.

proc GetByteRangeLockStatus(

    IN AFSFid *Fid,

    OUT AFSByteRangeLockSeq *AssertedLocks_Array,

        AFSLockHostIdentifierSeq *Clients_Array

) = 65605;

  Error Codes

  EACCES

The caller does not have the necessary rights.

2.6.6 CancelByteRangeLock

The CancelByteRangeLock procedure permits system administrators
to revoke active locks that may be obstructing normal operations,
perhaps due to a system or network problem. Fid is the file on
which to revoke locks. If successful, all locks in range [Offset,
Offset+Length) are canceled. If a value of 0 is given for both
Offset and Length, the range is taken to span the entire file.
The procedure may only be executed by the AFS super user or
members of the system:administrators group.

proc CancelByteRangeLocks(

    IN AFSFid *Fid,

       afs_uint64 Offset,

       afs_uint64 Length

) = 65606;

2.6.7 AssertExtendLocks

On receipt of an AFSCB_Cancel_ExtendLocks or
AFSCB_Flag_ExtendLocks notification through the extended callback
interface, a client MUST either:

• return any locks it asserts in AssertedLocks_Array, the member
  of union AFSCB_ResultData used for these calls

  - if the server rejects any locks asserted by the client, it
    will so notify the client in a subsequent cancellation message

• set a result of AFSCB_Result_ResponseDeferred, and execute the
  AssertExtendLocks bulk call before the ExpirationTime in the
  AFSExtendedCallback structure sent with the callback

Fid is the file for which locks are being extended. Flags
contains indication of special semantics (e.g., mandatory
enforcement) being asserted, if any. AssertedLocks_Array points
to a variable length array of AFSByteRangeLock structures the
client asserts to hold. At the completion of the call, the
parallel array OutResult indicates the server's confirmation (or
refusal) to extend each asserted lock--a nonzero value of
(Flags & AFSLock_Flag_Extend_Ok) in an entry indicates
confirmation.

/* Assert locks on Fid, on request */

proc AssertExtendLocks(

    IN AFSFid *Fid,

        afs_uint32 Flags,

        AFSByteRangeLockSeq *AssertedLocks_Array,

    OUT AFSLockFlagsSeq *OutResult

) = 65607;
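Reading the parallel OutResult array might be sketched as below.
The flag value is the one defined in section 2.4.2; the function
name is hypothetical, and locks are represented by arbitrary
Python objects.

```python
AFSLock_Flag_Extend_Ok = 4  # value from section 2.4.2


def split_extended(asserted, out_result):
    """Pair each asserted lock with the server's flags entry.

    Returns (kept, lost): locks the server confirmed, and locks
    the client must now treat as canceled.
    """
    if len(asserted) != len(out_result):
        raise ValueError("OutResult must parallel AssertedLocks_Array")
    kept, lost = [], []
    for lock, flags in zip(asserted, out_result):
        (kept if flags & AFSLock_Flag_Extend_Ok else lost).append(lock)
    return kept, lost
```

Entries may carry other flag bits as well; only the
AFSLock_Flag_Extend_Ok bit decides confirmation in this sketch.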

2.7 Windows & Unix Lock Semantics

Implementation of interoperable locking behavior presents
challenges for a distributed file system like AFS, which must
support clients on platforms which do not agree precisely on the
semantics desirable or possible to enforce.

2.7.1 Byte-Range Locking

As byte-range locking is effectively required for correct
behavior of Windows applications, the OpenAFS for Windows client
has been forced to implement a locally-enforced byte-range
locking mechanism. In the Windows client today, local byte-range
locks are shadowed by a whole-file lock in AFS. With the
introduction of server-coordinated byte-range locking, the
Windows client is expected to use server byte-range locks when
possible.

2.7.2 Read/Write vs. Shared/Exclusive

In the current OpenAFS for Windows client, Windows shared
(whole-file) locks are mapped to AFS read locks, and Windows
exclusive (whole-file) locks are mapped to AFS write locks. This
mapping applies equally for byte-range locks.

2.7.3 Atomic Lock Open

Windows provides the ability to open and lock a file in a single
operation, and key Windows applications such as Microsoft Office
rely on this behavior. Although this behavior has no direct
equivalent in the AFS protocol (which does not provide an OPEN
file operation), the correct behavior from the point of view of
Windows applications is already emulated by the Windows client.

2.8 Mandatory Enforcement

Mandatory enforcement of file locks is considered a requirement
for Windows interoperation. The rules proposed here reflect some
consideration and discussion of unique features in AFS, and also
compromises made in competing systems intended to support mixed
Windows and Unix clients, particularly NFSv4.

2.8.1 Governing Ideas

• Byte-range locks may be taken out on a file under the same
  circumstances under which a whole-file lock might be taken out
  in traditional AFS

• Clients asserting advisory locks on a file by definition do not
  expect any special semantics from the file system; however, it
  seems logically reasonable that advisory and mandatory locks
  should interact equivalently as locks, and so where this
  document asserts that in a given scenario, a lock by a client A
  would conflict with a lock held by a client B, it is not
  considered relevant whether either client's lock is advisory or
  mandatory

• The mechanism of lock enforcement is to fail the operation
  being attempted; a hint of the reason for failure shall be sent
  in the return code

• An operation which fails due to conflict with an existing lock
  fails completely

• Mandatory enforcement is taken to mean enforcement, generally,
  of write denial in any locked range, including by clients not
  observing any locking protocol

• When a write conflicts with a mandatory locked range,
  considering the view of locks on the file at the file server
  when the write request is processed, data it intended to write
  outside the conflicting locked range is not written either

• Since applications exist, particularly for the command line
  (e.g., tar), which know nothing about locks and may have
  legitimate reason to read (though not write) data protected by
  mandatory locks, relaxed semantics are enforced for reads by
  clients reading outside any range they have themselves
  locked--such reads never conflict with lock enforcement--the
  view of data provided to such a client shall be whatever is
  available, conforming to regular AFS semantics

• Mandatory enforcement of a read or write lock is asserted to
  govern only the StoreData operation (by other clients), and
  not, e.g., the various directory change operations or FetchData
  [footnote: Mandatory read lock enforcement is silly, Eisler
  2006. More importantly, it causes difficulties for the AFS
  cache consistency model.]

2.8.2 Enforcement Rules

• If a client A has a mandatory lock of any type on a range R in
  a file F, then any StoreData operation by another client B
  which would alter data in any overlapping range, or truncate F
  so as to reduce or eliminate R, fails
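The rule can be sketched as a check a file server might run
before applying a StoreData request. The function name, the
(owner, offset, length) tuple layout, and the choice to treat any
resulting file length below the end of R as reducing R are
assumptions made for illustration.

```python
def store_conflicts(mandatory_locks, writer, offset, length,
                    new_file_length=None):
    """Return True if a StoreData by writer must fail.

    mandatory_locks: (owner, offset, length) tuples for mandatory
    locks on the file. new_file_length: resulting file length if
    the request truncates, else None.
    """
    for owner, l_off, l_len in mandatory_locks:
        if owner == writer:
            continue  # the holder itself is not restricted
        l_end = l_off + l_len
        writes_into_range = (length > 0 and offset < l_end
                             and l_off < offset + length)
        truncates_range = (new_file_length is not None
                           and new_file_length < l_end)
        if writes_into_range or truncates_range:
            return True
    return False
```

Because a failing operation fails completely, a server following
this sketch would reject the whole request when the function
returns True, rather than apply the non-conflicting part.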

3 Delegation

3.1 Dependencies

The delegation feature depends on support for extended callback
notifications (and its dependencies) and on byte-range locking
support in client and server.

3.2 Backward Compatibility

AFS clients and servers will indicate their support for
delegation through new client and file server capability flags:

const CLIENT_CAPABILITY_DELEGATION = 0x0010;

const VICED_CAPABILITY_DELEGATION = 0x0020;

3.3 Lock Delegation

The concept of delegation is introduced to prevent the stronger
file semantics introduced by the proposed byte-range locking
mechanisms from introducing a performance degradation, in the
case of a single client making uncontested use of byte-range
locks. Since the Windows client (and also, less importantly, the
Linux client) currently provides locally-enforced byte-range
locks (shadowed by a whole-file lock in AFS) to clients
requesting them, and since Windows applications in particular
(e.g., Microsoft Office) make extensive use of such locks, this
is in fact a common and probably important case.

3.4 File Delegation

In developing the concepts in this proposal and the previously
submitted Callback Extended Information proposal we have
considered ideas from NFSv4 and other recent systems, such as the
(incomplete) CRFS system, and in particular, we have attempted to
suggest an evolutionary path for AFS which might provide the
stronger file semantics and efficient handling of mutable data
that we think a modern distributed file system should
provide--while not sacrificing the powerful caching features
which make AFS valuable and unique.

Reconsideration of NFSv4 delegation in the light of final drafts
of the Callback Extended Information proposal has influenced us
to think that a concept of AFS delegation might be developed in
which the lock delegation concept suggested in section 3.3,
combined with more deterministic semantics for files primarily
under client vs. files primarily under server control, would form
the key concepts.

Delegation is the NFSv4 file caching mechanism, and also supports
lock delegation. However, delegation has more deterministic
semantics in NFSv4 than caching presently has in AFS. Adding an
explicit delegation concept to AFS provides an opportunity to
tighten the semantics for delegated and undelegated files in AFS.
In particular, as in the Callback Extended Information proposal,
we are interested in improving AFS cache consistency with respect
to mutable data. AFS clients (e.g., the OpenAFS Windows client)
are already moving away from traditional AFS sync-on-close
behavior, toward a continuous, best-effort sync behavior. The
OpenAFS Roadmap contains language, with which we agree,
indicating that best-effort synchronisation is actually more
efficient than sync-on-close. We propose to formalize this
behavior and define it as specified behavior for clients
supporting delegation, and operating on a file without an
explicit byte-range delegation from the file server.

While clearly related to NFSv4 delegations (and also Oplocks in
the Microsoft CIFS protocol), the delegation concept proposed
here for AFS differs from NFSv4 delegation. In particular, since
the AFS protocol supports caching explicitly through existing
protocol mechanisms, the delegation concept is introduced to
strengthen AFS caching semantics in specific situations only, and
is in no sense a new caching mechanism.

NFSv4 supports read and write file delegations, concepts which
overlap but are inconsistent with the AFS caching model. An NFSv4
read delegation confers permission to cache a file (or byte
range, under the byte-range delegation proposal), and carries an
assertion that no client has a write delegation. An NFSv4 write
delegation confers permission to cache file writes. Since in AFS
caching is always permitted, and clients are always notified of
file changes, an AFS client with a callback on a file by
definition always has the equivalent of an NFSv4 read delegation.
In our proposal, an AFS delegation somewhat resembles an NFSv4
write delegation. A client with a delegation on a byte range may
cache writes in the range, at its discretion, until the
delegation is recalled. Read and write operations from contending
clients will induce the fileserver to recall overlapping
delegations it may have issued in the affected range. The
contending operations will not complete until the client whose
delegation is being recalled has had an opportunity to flush its
changes and return any locks it issued while the delegation was
in effect.

NFSv4.1 (May 2008) supports directory delegations[7]. This
proposal does not include directory delegation. Experience gained
implementing and using AFS file delegations should help to
clarify whether directory delegations would be a useful addition
in future (for example, to facilitate implementation of
hierarchical server-to-server replication as implemented for
NFSv4 in [10]).

Since 2005, an NFSv4 extension to support byte-range delegations
has been proposed[8].[footnote:
I do not find evidence in NFSv4.1 Draft 23 that Byte-Range
Delegations were included in NFSv4.1, but they may be a
NetApp-implemented extension.
] The stated motivation for NFSv4 byte-range delegations,
supported by analysis of the suggested protocol changes, is to
facilitate cache-coherent updates by multiple writers, or writers
and readers, on disjoint byte ranges in a file[9]. More
specifically, byte-range delegation is an NFSv4 mechanism to
permit partial file caching, which AFS has always supported
(range-based when using extended callback information), together
with a type of range-based invalidation.

Thus in the context of NFSv4, byte-range delegation significantly
overlaps in function with the general AFS caching model and with
extended callback information. Early versions of this proposal
defined whole-file delegation only, arguing that this would
provide best-effort visibility of changes across clients, with
good efficiency, and that it would be sufficient to efficiently
support the live multimedia stream example used to motivate
NFSv4 byte-range delegations in [9]. Early reviewers have argued
for inclusion of byte-range delegation, in consideration that it
is more expressive (not a whole-file caching hack) and would be
desirable for applications such as distributed databases or HPC
applications. Correspondingly, the current proposal now includes
a byte-range delegation concept. Clients iteratively and
aggressively updating or locking in disjoint ranges of a file
would be eligible to operate in disjoint, byte-range delegations.
Further feedback from reviewers is requested. Feedback on
specific applications and usage models we should support would be
especially helpful.

3.4.1 Semantic Changes

For AFS, I suggest the following semantics and supporting
mechanisms for delegation:

• only files may be delegated

• with respect to file data, a file delegation, if mutually
  accepted in client and file server, shall indicate a
  strengthened semantics for file caching such that

  – a byte range under delegation shall be regarded as under
    exclusive control of one client, which may then observe any
    synchronisation/flush semantics on the range for the duration
    of the delegation

  – a byte range not under delegation shall be regarded as under
    server control, potentially shared by multiple readers and/or
    writers, such that clients must observe more strict
    synchronisation/flush semantics, defined to mean an
    obligation to flush changes continuously at best effort, with
    the special exception that

  – revocation of a file delegation shall obligate the client to
    whom the file was formerly delegated to store any data changed
    during the period of delegation, and to ``return'' the
    now-resynchronised byte range to the file server using its
    UndelegateReturningLocks procedure, within a time window
    provided by the server in its AFSCB_Cancel_RevokeDelegation
    callback cancellation message

• with respect to byte-range locks,

  – a byte range under delegation shall be regarded as under
    exclusive control of one client, which may then issue
    byte-range locks of any type within the range, without
    consideration of the file server

  – a byte range not under delegation shall be regarded as under
    server control, such that all locking requests must be
    executed at the file server, using the interfaces defined in
    this proposal

  – recall of a byte range delegation shall obligate the client
    to whom the file was formerly delegated to ``return'' the
    now-resynchronised byte range and all issued locks to the
    file server using its UndelegateReturningLocks procedure,
    within the time window provided by the server in its
    AFSCB_Cancel_RevokeDelegation callback cancellation message
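
The locking rules above imply a simple routing decision in the client:
a lock request that falls wholly inside a delegated range can be
granted locally, and anything else must go to the file server. A
minimal C sketch follows, assuming that the client tracks its
unexpired delegations on a file as a flat array and that a request is
local only when a single delegation covers it entirely (adjacent
delegations are not merged). All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One unexpired delegation held by this client on a file. */
struct held_delegation {
    uint64_t offset;
    uint64_t length;
};

enum lock_path { LOCK_LOCALLY, LOCK_AT_FILE_SERVER };

/* True if [offset, offset + length) lies entirely inside *d. */
static bool delegation_covers(const struct held_delegation *d,
                              uint64_t offset, uint64_t length)
{
    if (offset < d->offset)
        return false;
    uint64_t skip = offset - d->offset;   /* bytes before the request */
    return skip <= d->length && length <= d->length - skip;
}

/* Decide where a byte-range lock request must be executed. */
enum lock_path choose_lock_path(const struct held_delegation *held,
                                size_t count,
                                uint64_t offset, uint64_t length)
{
    assert(held != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (delegation_covers(&held[i], offset, length))
            return LOCK_LOCALLY;
    }
    return LOCK_AT_FILE_SERVER;
}
```

Locks granted on the local path are the ones the client later has to
return to the file server when the delegation is recalled.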

3.4.2 Delegation

It is reasonable for a file server to issue a byte range
delegation in response to any of several file operations on a
byte range which is not already delegated, and which is not known
to be unsuitable for delegation for operational reasons. This
proposal assumes that, presuming a client and server are mutually
capable of delegation, the general behavior of the file server
should be to issue a delegation if no rule or heuristic would
prevent it.

• The file server MAY issue a delegation in response to any of
  the FetchData or StoreData operations

• The file server MUST NOT issue a delegation for any byte range
  for which there is an existing delegation--and in fact, in any
  case where it might do so, it MUST recall the conflicting
  delegation(s) (see section 3.4.3).

• The file server SHOULD NOT issue a delegation if it has
  heuristic information that would suggest delegating a
  particular byte range would be inefficient, e.g., because a
  given file is frequently operated on by a variety of clients

• The file server SHOULD NOT issue a delegation in response to
  FetchStatus operations in absence of other supporting
  information, as these are commonly issued by clients scanning
  directories

It is expected that clients may request, or that a file server
may offer clients, a delegation on a range larger than the
smallest range compatible with the file operation or explicit
request which triggered the delegation.
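
The issuing rules of this section can be read as a short decision
procedure. The C sketch below is one possible reading, not a
normative algorithm: the enum and struct names are hypothetical, the
heuristic input is reduced to a single boolean, and an explicit
RequestDelegation call is treated as an eligible trigger alongside
FetchData and StoreData.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fs_op {
    OP_FETCH_DATA,
    OP_STORE_DATA,
    OP_FETCH_STATUS,
    OP_REQUEST_DELEGATION,
    OP_OTHER
};

enum deleg_decision {
    DELEG_NOT_ISSUED,    /* carry out the operation without delegating */
    DELEG_RECALL_FIRST,  /* conflicting delegation must be recalled    */
    DELEG_ISSUE          /* a delegation may be issued now             */
};

struct deleg_request {
    enum fs_op op;
    bool mutually_capable;        /* both peers advertise delegation   */
    bool range_already_delegated; /* an existing delegation overlaps   */
    bool heuristics_discourage;   /* e.g., file is heavily shared      */
};

enum deleg_decision decide_delegation(const struct deleg_request *r)
{
    assert(r != NULL);

    if (!r->mutually_capable)
        return DELEG_NOT_ISSUED;

    /* FetchStatus alone SHOULD NOT trigger a delegation. */
    if (r->op != OP_FETCH_DATA && r->op != OP_STORE_DATA &&
        r->op != OP_REQUEST_DELEGATION)
        return DELEG_NOT_ISSUED;

    if (r->heuristics_discourage)
        return DELEG_NOT_ISSUED;

    /* MUST NOT issue over an existing delegation; MUST recall it. */
    if (r->range_already_delegated)
        return DELEG_RECALL_FIRST;

    return DELEG_ISSUE;
}
```

The recall obligations of section 3.4.3 apply independently of this
decision: a contending FetchData or StoreData forces a recall even
when the server has no intention of issuing a new delegation.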

3.4.3 Revocation

A file server MAY recall file delegations at any time, for any
reason. A file server MUST recall file delegations when a client
other than the one to which a delegation has been issued performs
any of the following operations on the file:

• StoreData operations

• FetchData operations

• any other operation that strongly indicates likelihood of intent
  to read or alter file contents (e.g., any Open indication,
  should it be added to the AFS protocol)

When the file server wishes to recall a file delegation, it
issues an AFSCB_Cancel_RevokeDelegation notification to the
client via the ExtendedCallback interface. Alternatively, it may
set AFSCB_Flag_RevokeDelegation in any other ExtendedCallback
notification message.
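
To make the recall trigger concrete, the following C sketch selects
the delegations a file server would recall when a client performs a
FetchData or StoreData on a byte range. It assumes the narrower
reading given in section 3.4, in which only delegations overlapping
the affected range are recalled; a server may always recall more. The
table layout, the numeric client identifier and the function name are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One delegation the server has issued on a given file. */
struct issued_delegation {
    uint32_t client_id;   /* hypothetical host identifier */
    uint64_t offset;
    uint64_t length;
};

/* End of [offset, offset + length), saturating instead of wrapping. */
static uint64_t span_end(uint64_t offset, uint64_t length)
{
    return (length > UINT64_MAX - offset) ? UINT64_MAX : offset + length;
}

/*
 * Collect indexes of delegations held by clients other than `requester`
 * that overlap [offset, offset + length). Returns the total number
 * found; at most `out_cap` indexes are written to `out`.
 */
size_t find_recalls(const struct issued_delegation *table, size_t count,
                    uint32_t requester, uint64_t offset, uint64_t length,
                    size_t *out, size_t out_cap)
{
    size_t found = 0;

    assert(table != NULL || count == 0);
    if (length == 0)
        return 0;

    for (size_t i = 0; i < count; i++) {
        const struct issued_delegation *d = &table[i];

        if (d->client_id == requester || d->length == 0)
            continue;
        if (offset < span_end(d->offset, d->length) &&
            d->offset < span_end(offset, length)) {
            if (found < out_cap)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Each selected entry would then receive an AFSCB_Cancel_RevokeDelegation
notification, and the contending operation would wait until the
holders have returned their ranges or the return window has elapsed.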

3.5 Constants

3.5.1 Delegation Types

The current proposal defines one delegation type. New delegation
types, with new semantics, may be defined by future proposals.

const AFS_DType_General = 0;

  AFS_DType_General

Represents general delegation as defined in this proposal.

3.5.2 Callback Constants

The following extended callback event type is added:

const AFSCB_Event_Delegation = 13;

The following callback cancellation types and flags are provided,
to permit management of delegations through the ExtendedCallback
interface:

const AFSCB_Flag_Delegation = 2; /* file delegation */

const AFSCB_Cancel_RevokeDelegation = 9; /* delegation is revoked */

const AFSCB_Flag_RevokeDelegation = 16; /* delegation is revoked */

The following constant is provided as a discriminator for the
AFSCB_ResultData member of AFSCBExtendedCallbackResult allowing
clients to indicate their intention to defer returning locks or
delegations to a subsequent RPC on the file server:

const AFSCB_Result_ResponseDeferred = 2;

The following constant is provided as a discriminator for the
AFSCB_ResultData member of AFSCBExtendedCallbackResult allowing
clients to indicate their intention to return locks in the
CallBack_Result_Array OUT parameter:

const AFSCB_Result_ReturnLocks = 3;

  AFSCB_Flag_Delegation

When set in the Flags member of an AFSExtendedCallback structure,
indicates that the callback promise includes file delegation. The
delegation persists for the life of the callback, unless recalled
through an ExtendedCallback notification.

  AFSCB_Cancel_RevokeDelegation

When sent as the reason for cancellation in an ExtendedCallback
notification, indicates that the file delegation on FID has been
recalled. The client MUST store all data in FID which has changed
during the period of delegation, and then execute the file
server's UndelegateReturningLocks procedure for all locks it
asserts on FID, prior to the ExpirationTime in the extended
callback message.

  AFSCB_Flag_RevokeDelegation

Has the same meaning and effect as AFSCB_Cancel_RevokeDelegation,
but may be sent with an arbitrary extended callback message.

  AFSCB_Flag_ExtremePrejudice

Combined with AFSCB_Flag_RevokeDelegation, indicates that the
resync/lock return period for an already-recalled delegation is
over. The client is requested to stop lock-return activity.

3.6 DataTypes

3.6.1 AFSDelegation

The AFSDelegation data type represents a delegation issued by a
fileserver to some client on a specific byte range in Fid.

struct AFSDelegation {
    AFSFid Fid;
    afs_uint32 Type;
    afs_uint32 Flags;
    afs_uint64 Offset;
    afs_uint64 Length;
    afs_uint64 ExpirationTime;
};

  Fid

The Fid being delegated.

  Type

The type of the delegation, currently restricted to
AFS_DType_General.

  Flags

A bit mask of flag values provided for future extension.

  Offset

The starting offset of the byte range being delegated.

  Length

The length of the byte range being delegated.

  ExpirationTime

Time in seconds since the Epoch after which the delegation must
be considered invalid. A server implementation MAY offer a new
AFSDelegation effectively extending the expiration time of an
existing delegation at any convenient time. (Clients may also
explicitly request an extension by calling the RequestDelegation
interface prior to ExpirationTime.)
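
A client therefore has to track ExpirationTime and decide when to ask
for an extension. A minimal C sketch of that bookkeeping follows. The
renewal margin is a hypothetical client policy knob, not something
this proposal defines, and both times are assumed to be seconds since
the Epoch on a clock reasonably synchronised with the file server.

```c
#include <assert.h>
#include <stdint.h>

enum deleg_state {
    DELEG_VALID,      /* usable, no action needed                */
    DELEG_RENEW_NOW,  /* still usable, but request an extension  */
    DELEG_EXPIRED     /* must be treated as no longer held       */
};

/*
 * Classify a delegation given its ExpirationTime and the current time.
 * `renew_margin` is how many seconds before expiry the client starts
 * asking for an extension (hypothetical policy parameter).
 */
enum deleg_state classify_delegation(uint64_t expiration_time,
                                     uint64_t now,
                                     uint64_t renew_margin)
{
    if (now > expiration_time)
        return DELEG_EXPIRED;
    if (expiration_time - now <= renew_margin)
        return DELEG_RENEW_NOW;
    return DELEG_VALID;
}
```

On DELEG_RENEW_NOW a client would call RequestDelegation for the same
range; on DELEG_EXPIRED it must fall back to server-side locking and
best-effort flushing for that range.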

3.6.2 AFSExtendedCallBack

A new value of AFSCB_Event_Delegation is added to union
AFSCB_NotificationData used in struct AFSExtendedCallBack. The
type of the union at AFSCB_Event_Delegation is AFSDelegation. The
new extended callback notification is used by the file server to
indicate it has granted a file delegation on FID to the client.

3.7 Procedures

3.7.1 RequestDelegation

The RequestDelegation procedure is added to the fileserver
interface, permitting the client to request an explicit
delegation on a byte range. A client implementation MAY choose to
make an explicit delegation request based on a client application
fadvise or madvise API call, or similar mechanism appropriate to
its platform.

/* Request explicit delegation of a byte range */
RequestDelegation(
    IN AFSFid Fid,
        afs_uint32 Type,
        afs_uint32 Flags,
        afs_uint64 Offset,
        afs_uint64 Length,
    OUT AFSDelegation *Delegation
) = 65608;

  Fid

The Fid being delegated.

  Type

The type of the delegation, currently restricted to
AFS_DType_General.

  Flags

A bit mask of flag values provided for future extension.

  Offset

The starting offset of the byte range being delegated.

  Length

The length of the byte range being delegated.

  Delegation

The Delegation returned from the fileserver, if granted.

  Error Codes

  EACCES

The caller does not have the necessary rights.

  EWOULDBLOCK

The server is unable to grant the request due to a conflicting
delegation.

  EINVAL

An illegal delegation type or range was specified.
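
The error codes above suggest an order of checks in the server-side
handler. The following C sketch shows one plausible ordering. It
assumes that a zero-length range and a range whose end would exceed
the 64-bit offset space both count as illegal ranges, which the text
does not state explicitly; the function name and its boolean inputs
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define AFS_DType_General 0u

/*
 * Validate a RequestDelegation call. Returns 0 if the request may
 * proceed, otherwise one of the error codes listed for the procedure.
 */
int check_delegation_request(uint32_t type,
                             uint64_t offset, uint64_t length,
                             bool caller_has_rights,
                             bool conflicting_delegation)
{
    if (!caller_has_rights)
        return EACCES;

    /* Only the general delegation type is defined so far. */
    if (type != AFS_DType_General)
        return EINVAL;

    /* Assumed: empty or wrapping ranges are illegal. */
    if (length == 0 || length > UINT64_MAX - offset)
        return EINVAL;

    if (conflicting_delegation)
        return EWOULDBLOCK;

    return 0;
}
```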

3.7.2 UndelegateReturningLocks

The UndelegateReturningLocks bulk call MUST be executed by
clients on receipt of an AFSCB_Cancel_RevokeDelegation or
AFSCB_Flag_RevokeDelegation notification through the extended
callback interface. The call must be executed before the
ExpirationTime in the AFSExtendedCallback structure sent with the
callback. Fid is the file for which the delegation is being
returned. Flags contains indication of special semantics (e.g.,
mandatory enforcement) being asserted, if any.
AssertedLocks_Array points to a variable length array of
AFSByteRangeLock structures the client asserts to hold. At the
completion of the call, parallel array OutResult indicates the
server's confirmation (or refusal) to assert each returned lock
after undelegation--a nonzero value of
(Flags & AFSLock_Flag_Undelegate_Ok) indicates confirmation.

/* Confirm undelegation and req. assert locks, if any */
UndelegateReturningLocks(
    IN AFSFid Fid,
        afs_uint32 Flags,
        AFSByteRangeLockSeq *AssertedLocks_Array,
    OUT AFSLockFlagsSeq *OutResult
) = 65609;
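
After the call returns, the client has to walk OutResult in parallel
with the locks it submitted and drop any lock the server refused to
assert. A minimal C sketch, assuming that each OutResult element is a
32-bit flags word and that the client keeps the two arrays in the
same order; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define AFSLock_Flag_Undelegate_Ok 8u

/*
 * Scan the per-lock result flags returned by the server. Writes the
 * indexes of refused locks to `refused` (at most `refused_cap` of
 * them) and returns the total number of refusals.
 */
size_t collect_refused_locks(const uint32_t *out_result, size_t count,
                             size_t *refused, size_t refused_cap)
{
    size_t n = 0;

    assert(out_result != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if ((out_result[i] & AFSLock_Flag_Undelegate_Ok) == 0) {
            if (n < refused_cap)
                refused[n] = i;
            n++;
        }
    }
    return n;
}
```

Locks reported as refused are no longer held once the delegation is
gone, so the client would need to signal the loss to the owning
application in whatever way its platform permits.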

4 Appendix A: XDR Grammar (afsint.xg)

const VICED_CAPABILITY_BYTE_RANGE_LOCK = 0x0010;
const VICED_CAPABILITY_DELEGATION = 0x0020;

const AFSLock_Flag_Mand = 1; /* req. enforcement */
const AFSLock_Flag_Wait = 2; /* req. wait on lock */

const AFS_DType_General = 0;
const AFSCB_Event_Delegation = 13;



struct AFSByteRangeLock {
    AFSFid Fid;
    afs_uint32 Type;
    afs_uint32 Flags;
    afs_uint32 Owner;
    afs_uint32 Uniq;
    afs_uint64 Offset;
    afs_uint64 Length;
    afs_uint64 ExpirationTime;
};

struct AFSDelegation {
    AFSFid Fid;
    afs_uint32 Type;
    afs_uint32 Flags;
    afs_uint64 Offset;
    afs_uint64 Length;
    afs_uint64 ExpirationTime;
};



/* Request byte-range file lock */
proc SetByteRangeLock(
    IN AFSFid *Fid,
        afs_uint32 Type,
        afs_uint32 Flags,
        afs_uint32 Owner,
        afs_uint32 Uniq,
        afs_uint64 Offset,
        afs_uint64 Length,
    OUT AFSByteRangeLock *Lock
) = 65601;

/* Release byte-range file lock */
proc ReleaseByteRangeLock(
    IN AFSByteRangeLock *Lock
) = 65602;

/* Upgrade byte-range file lock (i.e., from Read to Write) */
proc UpgradeByteRangeLock(
    IN AFSByteRangeLock *Lock,
        afs_uint32 Type
) = 65603;

/* Downgrade byte-range file lock (i.e., from Write to Read) */
proc DowngradeByteRangeLock(
    IN AFSByteRangeLock *Lock,
        afs_uint32 Type
) = 65604;

/* Request lock status report (system:administrators) */
proc GetByteRangeLockStatus(
    IN AFSFid *Fid,
    OUT AFSByteRangeLockSeq *AssertedLocks_Array,
        AFSLockHostIdentifierSeq *Clients_Array
) = 65605;

/* administratively cancel locks (system:administrators) */
proc CancelByteRangeLocks(
    IN AFSFid *Fid,
        afs_uint64 Offset,
        afs_uint64 Length
) = 65606;



const AFS_LOCK_SEQ_MAX = 10000;

typedef AFSByteRangeLock AFSByteRangeLockSeq<AFS_LOCK_SEQ_MAX>;
typedef afs_uint32 AFSLockFlagsSeq<AFS_LOCK_SEQ_MAX>;

const AFSLock_Flag_Extend_Ok = 4; /* extended */
const AFSLock_Flag_Undelegate_Ok = 8; /* undelegated, asserted */



/* Assert locks on Fid, on request */
AssertExtendLocks(
    IN AFSFid Fid,
        afs_uint32 Flags,
        AFSByteRangeLockSeq *AssertedLocks_Array,
    OUT AFSLockFlagsSeq *OutResult
) = 65607;

/* Request explicit delegation of a byte range */
RequestDelegation(
    IN AFSFid Fid,
        afs_uint32 Type,
        afs_uint32 Flags,
        afs_uint64 Offset,
        afs_uint64 Length,
    OUT AFSDelegation *Delegation
) = 65608;

/* Confirm undelegation and req. assert locks, if any */
UndelegateReturningLocks(
    IN AFSFid Fid,
        afs_uint32 Flags,
        AFSByteRangeLockSeq *AssertedLocks_Array,
    OUT AFSLockFlagsSeq *OutResult
) = 65609;

5 Appendix B: XDR Grammar (afscbint.xg)

const CLIENT_CAPABILITY_BYTE_RANGE_LOCK = 0x0008;
const CLIENT_CAPABILITY_DELEGATION = 0x0010;

/* Cancellation Types */
const AFSCB_Cancel_ExtendLocks = 7; /* re-assert locks, or lose them */
const AFSCB_Cancel_RevokeLocks = 8; /* locks on Fid revoked */
const AFSCB_Cancel_RevokeDelegation = 9; /* delegation is revoked */

/* Delegation Callback Flag */
const AFSCB_Flag_Delegation = 2; /* file delegation */

/* Cancellation Flags */
const AFSCB_Flag_AssertLocks = 4; /* request ExtendLock */
const AFSCB_Flag_RevokeLocks = 8; /* locks cancelled, sorry */
const AFSCB_Flag_RevokeDelegation = 16; /* delegation is revoked */

/* confirm issue of deferred lock requests */
proc AsyncIssueByteRangeLock(
    IN HostIdentifier *Server,
       AFSByteRangeLockSeq <AFS_LOCK_SEQ_MAX>
) = 65540;



/* extended callback expansion for delegation */



struct AFSCB_Data_Delegation {
    AFSFid Fid;
    afs_uint32 Flags;
    afs_uint64 Offset;
    afs_uint64 Length;
    afs_uint64 ExpirationTime;
};

union AFSCB_NotificationData switch (afs_uint32 Event_Type) {
case AFSCB_Event_StoreData:
    AFSCB_Data_StoreData u_store_data;
case AFSCB_Event_StoreACL:
    void;
case AFSCB_Event_StoreStatus:
    AFSCB_Data_StoreStatus u_store_status;
case AFSCB_Event_CreateFile:
    AFSCB_Data_CreateFile u_create_file;
case AFSCB_Event_MakeDir:
    AFSCB_Data_MakeDir u_make_dir;
case AFSCB_Event_Symlink:
    AFSCB_Data_Symlink u_symlink;
case AFSCB_Event_Link:
    AFSCB_Data_Link u_link;
case AFSCB_Event_RemoveFile:
    AFSCB_Data_RemoveFile u_remove_file;
case AFSCB_Event_RemoveDir:
    AFSCB_Data_RemoveDir u_remove_dir;
case AFSCB_Event_Rename:
    AFSCB_Data_Rename u_rename;
case AFSCB_Event_Deleted:
    void;
case AFSCB_Event_ReleaseLock:
    AFSCB_Data_Lock u_lock;
case AFSCB_Event_Cancel:
    void;
case AFSCB_Event_Delegation:
    AFSCB_Data_Delegation u_delegation;
};

References

[1] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[2] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., Beame,
C., Eisler, M., and D. Noveck, "Network File System (NFS) version
4 Protocol", RFC 3530, April 2003.

[3] Edward R Zayas, "AFS-3 Programmer's Reference: File
Server/Cache Manager Interface", Transarc Corporation,
FS-00-D162, 20th August 1991.

[4] Paul J. Leach, Dilip C. Naik. A Common Internet File System
(CIFS/1.0) Protocol
[http://www.tools.ietf.org/html/draft-leach-cifs-v1-spec-01],
1997.

[5] Jake Edge. CRFS and POHMELFS
[http://lwn.net/Articles/267896/].

[6] OpenAFS Roadmap [http://openafs.org/roadmap.html].

[7] S. Shepler, M. Eisler, D. Noveck. NFS Version 4 Minor Version
1
[http://www.ietf.org/internet-drafts/draft-ietf-nfsv4-minorversion1-23.txt],
May 2008.

[8] T. Myklebust, J. Fields, W. Adamson, P. Honeyman. Network
File System (NFS) version 4 byte range delegations
[http://tools.ietf.org/html/draft-myklebust-nfsv4-byte-range-delegations-00],
October 2005.

[9] Trond Myklebust. Byte Range Delegations
[https://www3.ietf.org/proceedings/05nov/slides/nfsv4-3.pdf],
November 2005.

[10] Jiaying Zhang and Peter Honeyman, "Reliable Replication at
Low Cost," CITI Technical Report 06-2, January 2006.


--------------050102070305060407030308--