[OpenAFS] Need details of callback mechanism -- questions ...

Dexter 'Kim' Kimball dhk@ccre.com
Thu, 1 Sep 2005 10:27:46 -0600

Debugging what looked like a callback-related race condition has ended up
challenging my beliefs about callbacks.

IIRC when a fileserver attempts a BCB to client X and the BCB fails to
elicit a response, the fileserver periodically retries the BCB.  When I was
teaching for Transarc we taught something like ... the fileserver retries
the BCB immediately, then puts the client on a 3 second list, then 10 second
list, then one minute list.  Eventually the fileserver decides that the
client has been unreachable for so long that it instructs the client, on next
contact, to mark all callbacks from <fileserver> as "iffy," causing the
client to RPC the fileserver to check AFS version numbers on cached files.
If the version numbers are the same then the client updates the callback
state and no data transfer occurs.  OTOH if the file has been changed and
the version numbers are different, the client receives data and then updates
the callback state.
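
To make the model above concrete, here is a minimal Python sketch of the
scheme as described from memory.  It is not taken from the fileserver
source and may not match current behavior.  The names RETRY_DELAYS,
break_callback, send_bcb and revalidate are all made up for illustration,
and the immediate first retry is an assumption.

```python
# Hypothetical model of the remembered retry scheme; NOT OpenAFS code.
# Assumed delays before each retry: immediate, 3 s list, 10 s list, 1 min list.
RETRY_DELAYS = [0, 3, 10, 60]


def break_callback(send_bcb, sleep=lambda seconds: None):
    """Attempt a BCB, then retry once per entry in RETRY_DELAYS.

    send_bcb is a hypothetical callable that returns True if the client
    answered.  Returns "ok" on success, or "iffy" if every attempt failed
    and the client should later be told to distrust its callbacks.
    """
    if send_bcb():
        return "ok"
    for delay in RETRY_DELAYS:
        sleep(delay)  # wait out the 0 / 3 / 10 / 60 second list
        if send_bcb():
            return "ok"
    return "iffy"


def revalidate(cached_version, server_version):
    """Client-side check of an "iffy" callback by file version number."""
    if cached_version == server_version:
        return "update callback state, no data transfer"
    return "fetch data, then update callback state"
```

Passing time.sleep as the sleep argument would make the waits real; the
default no-op keeps the sketch instant to run.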

I'm looking for definitive answers to the following.  Assume RW volumes.

1.  When a fileserver sends a BCB to a given client, does it wait for a
response or does it send the BCB and handle responses asynchronously?  I
believe it used to wait for a response and that it no longer does so.

2.  When does the fileserver begin sending the BCBs?
    a. When it begins to modify a given file -- i.e. when it receives the
write RPC and before (or simultaneously with) storing the first few bytes.
    b. When it has written the first bytes to a given file -- i.e. after it
has stored x bytes but before receiving a "close" from the client.
    c. When it receives the close file RPC.
3.  If the fileserver attempts a BCB to client X and gets no response (BCB
fails on X), does it:
    a. Retry immediately.
    b. Wait some period of time before attempting the BCB again.
    c. (a) then (b)
4.  What is the current fileserver BCB retry scheme?

Any info much appreciated.


Kim (Dexter) Kimball
CCRE, Inc.