[OpenAFS-win32-devel] New client-side locking implementation

Eric Williams ericjw@citi.umich.edu
Wed, 10 Aug 2005 15:12:31 -0400 (EDT)


i agree with almost all of this, except where noted.

On Mon, 8 Aug 2005 asanka@secure-endpoints.com wrote:

>
> Updated design is below.
>
> /* Byte range locks:
>
>    The OpenAFS Windows client has to fake byte range locks given no
>    server side support for such locks.  This is implemented as keyed
>    byte range locks on the cache manager.
>
>    Keyed byte range locks:
>
>    Each cm_scache_t structure keeps track of a list of keyed locks.
>    The key for a lock is essentially a token which identifies an owner
>    of a set of locks (referred to as a client).  The set of keys used
>    within a specific cm_scache_t structure form a namespace that has a
>    scope of just that cm_scache_t structure.  The same key value can
>    be used with another cm_scache_t structure and correspond to a
>    completely different client.  However it is advantageous for the
>    SMB or IFS layer to make sure that there is a 1-1 mapping between
>    client and keys irrespective of the cm_scache_t.
>
>    Assume a client C has key Key(C) (although, since the scope of the
>    key is a cm_scache_t, the key can be Key(C,S), where S is the
>    cm_scache_t.  But assume a 1-1 relation between keys and clients).
>    A byte range (O,+L) denotes byte addresses (O) through (O+L-1)
>    inclusive (a.k.a. [O,O+L-1]).  The function Key(x) is implemented
>    through cm_generateKey() function for both SMB and IFS.
>
>    The cache manager will set a lock on the AFS file server in order
>    to assert the locks in S->fileLocks.  If only shared locks are in
>    place for S, then the cache manager will obtain a LockRead lock,
>    while if there are any exclusive locks, it will obtain a LockWrite
>    lock.  If the exclusive locks are all released while the shared
>    locks remain, then the cache manager will downgrade the lock from
>    LockWrite to LockRead.
>
>    Lock states:
>
>    A lock exists iff it is in S->fileLocks for some cm_scache_t
>    S. Existing locks are in one of the following states: ACTIVE,
>    WAITLOCK, WAITUNLOCK, LOST, DELETED.
>
>    The following sections describe each lock and the associated
>    transitions.
>
>    1. ACTIVE: A lock L is ACTIVE iff the cache manager has asserted
>       the lock with the AFS file server.  This type of lock can be
>       exercised by a client to read or write to the locked region (as
>       the lock allows).
>
>       1.1 ACTIVE->LOST: When the AFS file server fails to extend a
>         server lock that was required to assert the lock.
>
>       1.2 ACTIVE->DELETED: Lock is released.
>
>    2. WAITLOCK: A lock is in a WAITLOCK state if the cache manager
>       grants the lock but the lock is yet to be asserted with the AFS
>       file server.  Once the file server grants the lock, the state
>       will transition to an ACTIVE lock.
>
>       2.1 WAITLOCK->ACTIVE: The server granted the lock.
>
>       2.2 WAITLOCK->DELETED: Lock is abandoned, or timed out during
>         waiting.
>
>       2.3 WAITLOCK->LOST: One or more locks from this client were
>         marked as LOST.  No further locks will be granted to this
>         client until al lost locks are removed.
>
>    3. WAITUNLOCK: A lock is in a WAITUNLOCK state if the cache manager
>       receives a request for a lock that conflicts with an existing
>       ACTIVE or WAITLOCK lock.  The lock will be placed in the queue
>       and will be granted at such time the conflicting locks are
>       removed, at which point the state will transition to either
>       WAITLOCK or ACTIVE.
>
>       3.1 WAITUNLOCK->ACTIVE: The conflicting lock was removed.  The
>         current serverLock is sufficient to assert this lock, or a
>         sufficient serverLock is obtained.
>
>       3.2 WAITUNLOCK->WAITLOCK: The conflicting lock was removed,
>         however the required serverLock is yet to be asserted with the
>         server.
>
>       3.3 WAITUNLOCK->DELETED: The lock is abandoned or timed out.
>
>       3.5 WAITUNLOCK->LOST: One or more locks from this client were
>         marked as LOST.  No further locks will be granted to this
>         client until all lost locks are removed.
>
>    4. LOST: A lock L is LOST if the server lock that was required to
>       assert the lock could not be obtained or if it could not be
>       extended, or if other locks by the same client were LOST.
>       Effectively, once a lock is LOST, the contract between the cache
>       manager and that specific client is no longer valid.
>
>       The cache manager rechecks the server lock once every minute and
>       extends it as appropriate.  If this is not done for 5 minutes,
>       the AFS file server will release the lock.  Once released, the
>       lock cannot be re-obtained without verifying that the contents
>       of the file hasn't been modified since the time the lock was
>       released.  Doing so may cause data corruption.
>
>       4.1 LOST->DELETED: The lock is released.
>
>       4.2 LOST->ACTIVE: The lock is reassertd.  This requires
>         verifying that the file was not modified in between.

Jeff had originally proposed allowing no file access once a lock was lost,
and before the file was re-asserted:

	"(2) if the lock was dropped but the callback is not valid that
	means the file has potentially changed and we can no longer assume
	the validity of the data.  at this point we mark the lock as being
	lost and prevent any further reads from or writes to the file
	until the file is closed and re-opened."

in this document, however, you seem to propose allowing i/o for the client
in all regions except those that have a lost lock for the same client.

i think either is probably fine, but i would change 4.2 to include:

	"verifying that the file range was not modified in between."
		OR
	"verifying that all locked ranges were not modified in between."

because this is what a lock guarantees.  file reassertion must be an
atomic operation, so these are equivalent.  however, i do understand that
some applications may lock a certain range of the file, and use this as a
proxy lock for the entire file.  if this is a concern, then i would do as
jeff proposed, and then leave 4.2 untouched.

>       4.3 LOST->WAITLOCK: All LOST ACTIVE locks from this client were
>         reasserted.  The cache manager can reinstate this waiting
>         lock.
>
>       4.4 LOST->WAITUNLOCK: All LOST ACTIVE locks from this client
>         were reasserted.  The cache manager can reinstate this waiting
>         lock.
>
>    5. DELETED: The lock is no longer relevant.  Eventually, it will
>       get removed from the cm_scache_t. In the meantime, it will be
>       treated as if it does not exist.
>
>       5.1 DELETED->not exist: The lock is removed from the
>         cm_scache_t.
>
>    6* A lock L is ACCEPTED if it is ACTIVE or WAITLOCK.
>       These locks have been accepted by the cache manager, but may or
>       may not have been granted back to the client.
>
>    7* A lock L is QUEUED if it is ACTIVE, WAITLOCK or WAITUNLOCK.
>
>    8* A lock L is EFFECTIVE if it is ACTIVE or LOST.
>
>    9* A lock L is WAITING if it is WAITLOCK or WAITUNLOCK.
>
>    Lock operation:
>
>    A client C can READ range (Offset,+Length) of cm_scache_t S iff:
>
>    1. for all _a_ in (Offset,+Length), one of the following is true:
>
>        1.1 There does NOT exist an ACTIVE lock L in S->fileLocks such
>          that _a_ in (L->LOffset,+L->LLength) (IOW: byte _a_ of S is
>          unowned)
>
>          AND
>
>          For each LOST lock M in S->fileLocks such that
>          _a_ in (M->LOffset,+M->LLength), M->LockType is shared AND
>          M->key != Key(C).
>
>          (Note: If this is a different client from one whose shared
>          lock was LOST, then the contract between this client and the
>          cache manager is indistinguishable from that where no lock
>          was lost.  If an exclusive lock was lost, then the range is
>          considered unsafe for consumption.)
>
>        1.3 There is an ACTIVE lock L in S->fileLocks such that: L->key
>          == Key(C) && _a_ in (L->LOffset,+L->LLength) (IOW: byte _a_
>          of S is owned by C under lock L)
>
>        1.4 There is an ACTIVE lock L in S->fileLocks such that _a_ in
>          (L->LOffset,L->+LLength) && L->LockType is shared (IOW: byte
>          _a_ of S is shared) AND there is no LOST lock M such that _a_
>          in (M->LOffset,+M->LLength) and M->key == Key(C)
>
>    A client C can WRITE range (Offset,+Length) of cm_scache_t S iff:
>
>    2. for all _a_ in (Offset,+Length), one of the following is true:
>
>        2.1 Byte _a_ of S is unowned (as above)

	"AND there is no lost exclusive lock on this range, AND
	there is no lost shared lock owned by this client."

>        2.2 Byte _a_ of S is owned by C under lock L (as above) AND
>          L->LockType is exclusive.
>
>    A client C can OBTAIN a lock L on cm_scache_t S iff:

OBTAIN taken to mean a lock may be created in the ACTIVE or WAITLOCK
state?  otherwise, the lock will be pending in the WAITUNLOCK state?

>    3. for all _a_ in (L->LOffset,+L->LLength), ALL of the following is
>       true:
>
>        3.1 L->LockType is exclusive IMPLIES there does NOT exist a QUEUED
> lock
>          M in S->fileLocks such that _a_ in (M->LOffset,+M->LLength).
>
>          (Note: If we count all QUEUED locks then we hit cases such as
>          cascading waiting locks where the locks later on in the queue
>          can be granted without compromising file integrity.  On the
>          other hand if only ACCEPTED locks are considered, then locks
>          that were received earlier may end up waiting for locks that
>          were received later to be unlocked. The choice of QUEUED
>          locks were made so that large locks don't consistently get
>          trumped by smaller locks which were requested later.)
>
>        3.2 L->LockType is shared IMPLIES for each QUEUED lock M in
>          S->fileLocks, if _a_ in (M->LOffset,+M->LLength) then
>          M->LockType is shared.
>
>    4. For each LOST lock M in S->fileLocks, M->key != Key(C)
>
>          (Note: If a client loses a lock, it loses all locks.
>          Subsequently, it will not be allowed to obtain any more locks
>          until all existing LOST locks that belong to the client are
>          released.  Once all locks are released by a single client,
>          there exists no further contract between the client and AFS
>          about the contents of the file, hence the client can then
>          proceed to obtain new locks and establish a new contract.)
>
>    A client C can only unlock locks L in S->fileLocks which have
>    L->key == Key(C).
>
>    The representation and invariants are as follows:
>
>    - Each cm_scache_t structure keeps:
>
>        - A queue of byte-range locks (cm_scache_t::fileLocks) which
>          are of type cm_file_lock_t.
>
>        - A record of the highest server-side lock that has been
>          obtained for this object (cm_scache_t::serverLock), which is
>          one of (-1), LockRead, LockWrite.
>
>        - A count of VALID exclusive and shared locks that are in the
>          queue (cm_scache_t::sharedLocks and
>          cm_scache_t::exclusiveLocks)

        - Time of issuance or last successful extension

instead of in the lock structure, since the server grants locks
effectively per cm_scache_t.

>    - Each cm_file_lock_t structure keeps:
>
>        - The type of lock (cm_file_lock_t::LockType)
>
>        - The key associated with the lock (cm_file_lock_t::key)
>
>        - The offset and length of the lock (cm_file_lock_t::LOffset
>          and cm_file_lock_t::LLength)
>
>        - The state of the lock.
>
>        - Time of issuance or last successful extension
>
>    Semantic invariants:
>
>        I1. The number of VALID locks in S->fileLocks are
>            (S->sharedLocks + S->exclusiveLocks)
>
>        I2. If L1 and L2 are both VALID locks in S->fileLocks, then L1
>            and L2 do NOT overlap. (enforced by 3.1 above)
>
>    External invariants:
>
>        I3. S->serverLock is the lock that we have asserted with the
>            AFS file server for this cm_scache_t.
>
>        I4. S->serverLock == LockRead iff there is at least one ACTIVE
>            shared lock, but no ACTIVE exclusive locks.
>
>        I5. S->serverLock == LockWrite iff there is at least one ACTIVE
>            exclusive lock.
>
>        I6. If a WAITUNLOCK lock L exists in S->fileLocks, then all
>            locks that L is waiting on are ahead of L in S->fileLocks.
>
>        I7. If L is a LOST lock, then for each lock M in S->fileLocks,
>            M->key == L->key IMPLIES M is LOST or DELETED.
>
>    --asanka
>  */

further comments?

eric