OpenAFS Master Repository branch, master, updated. openafs-devel-1_5_76-4584-g28f9712

Gerrit Code Review
Wed, 11 Feb 2015 09:06:00 -0500

The following commit has been merged in the master branch:
commit 28f9712b4b1c615e5d0b565fbcaa828b559bff4a
Author: Andrew Deason <>
Date:   Thu Jan 30 00:40:57 2014 -0600

    rx: Remove RX_CALL_BUSY
    Commit 23d6287f7f494383891a497038e8c0e870e824bf introduced the
    behavior where a client can immediately retry a call if it receives a
    "busy" packet from the server (meaning, the call channel is already in
    use). This happened via Rx returning the error code RX_CALL_BUSY, and
    the caller was supposed to immediately retry the call, so Rx could
    reissue the RPC on a different call channel.
    However, this behavior makes it more likely for the server to process
    an RPC that the client thinks has not been processed. Say the client
    issues an RPC, the server replies with a "busy" packet, and the client
    resends the original packet before it sees the "busy" packet. In
    this case, the server will get the resent packet for the RPC request
    and process it, but the client will think the call has failed (and
    presumably will retry the call on a new channel). For calls that are
    non-idempotent (e.g. MakeDir), this can result in incorrect errors
    (e.g. EEXIST) as well as incorrect cache state in the client.
    There may be some ways to mitigate at least some of the problems here,
    but this kind of "instant" retry behavior is often not really that
    helpful. Calls that take a very long time to run on the server are
    very rare (and usually indicate some other problem), while the
    occasional short-lived "busy" packet is relatively common (sometimes
    the server just hasn't cleaned up the call by the time we issue a new
    call). So just get rid of the retrying behavior to ensure we don't
    continue to encounter any problems like this.
    To get rid of this behavior, we remove the RX_CALL_BUSY code, and all
    code dealing with processing it. This means removing the RX_CALL_BUSY
    handling from the client, as well as removing
    rx_SetBusyChannelError(). This effectively reverts most of
    23d6287f7f494383891a497038e8c0e870e824bf, and a few other commits
    related to RX_CALL_BUSY.
    With this change, if all we get from the server are BUSY packets when
    we try to issue an RPC, the call will eventually error out with
    RX_CALL_TIMEOUT (or hang forever, if no timeouts are configured). This
    can be thought of intuitively as similar to "idle dead" behavior,
    since we are just waiting for the server to proceed with processing
    the call. So, if "idle dead" is configured, we still timeout after the
    "idle dead" timeout. And if no idle or hard dead timeout is
    configured, we will hang forever; just like if the server started
    processing the call but then hangs forever.
    Note that not all of 23d6287f7f494383891a497038e8c0e870e824bf is
    reverted. Namely, the logic to have rx_NewCall try to pick the "least
    busy" channel is retained.
    Thanks to Simon Wilkinson for bringing up and discussing this issue in
    this thread:
    Change-Id: I272e51f252356aa14bc4b8a3b7c594700deb432c
    Reviewed-by: Daria Brashear <>
    Reviewed-by: Benjamin Kaduk <>
    Tested-by: BuildBot <>
    Reviewed-by: Jeffrey Altman <>

 src/WINNT/afsd/afsd_init.c     |    2 -
 src/WINNT/afsd/cm_conn.c       |   21 -----
 src/WINNT/afsd/cm_server.c     |   13 +--
 src/WINNT/afsd/cm_utils.c      |    3 -
 src/afs/afs_analyze.c          |    2 +-
 src/afs/afs_call.c             |    2 -
 src/afs/afs_pag_call.c         |    1 -
 src/libafsrpc/afsrpc.def       |    2 +-
 src/libafsrpc/afsrpc.exp       |    1 -
 src/libafsrpc/ |    1 -
 src/rx/rx.c                    |  169 +++++-----------------------------------
 src/rx/rx.h                    |   10 +--
 src/rx/rx_prototypes.h         |    1 -
 13 files changed, 30 insertions(+), 198 deletions(-)

OpenAFS Master Repository