OpenAFS Master Repository branch, master, updated. openafs-devel-1_5_76-4581-g7acac94

Wed, 11 Feb 2015 09:04:31 -0500

The following commit has been merged in the master branch:
commit 7acac948fcda0e772326a26ad24481ccf1ae31ef
Author: Andrew Deason <adeason@sinenomine.net>
Date:   Mon Jan 27 00:36:14 2014 -0600

    rx: Rely on remote startWait idleness for idleDead
    
    This commit removes the functionality introduced in
    c26dc0e6aaefedc55ed5c35a5744b5c01ba39ea1 (which is also modified by a
    few later commits), as well as
    05f3a0d1e0359f604cc6162708f3f381eabcd1d7. Instead we modify the
    startWait check in rxi_CheckCall to apply to both "reading" and
    "writing" to enforce "idle dead" timeouts.
    
    Why do this? First, let's start out with the following:
    
    If an Rx call gets permanently "stuck", what happens? What should
    happen?
    
    Here, "stuck" means that either the server or client hangs while
    processing the call. The server or client is waiting for something to
    complete before it issues the next rx_Read() or rx_Write() call. In
    various situations over the years, this has happened because the
    server or client is waiting for a lock, waiting for local disk I/O to
    complete, or waiting for some other arbitrary event to occur.
    
    Currently, what happens with such a "hanging" call is a little
    complex, and has several different results in different situations.
    The behavior of a call in this "stuck" situation is handled by the
    "idle dead" timeout of an Rx call/connection. This timeout is enforced
    in rxi_CheckCall, in two different conditionals (if an "idle dead"
    timeout is configured):
    
        if (call->startWait && ((call->startWait + idleDeadTime) < now) &&
            (call->flags & RX_CALL_READER_WAIT)) {
            if (call->state == RX_STATE_ACTIVE) {
                cerror = RX_CALL_TIMEOUT;
                goto mtuout;
            }
        }
    
    and
    
        if (call->lastSendData && ((call->lastSendData + idleDeadTime) < now)) {
            if (call->state == RX_STATE_ACTIVE) {
                cerror = conn->service ? conn->service->idleDeadErr : RX_CALL_IDLE;
                idle_timeout = 1;
                goto mtuout;
            }
        }
    
    The first of these handles the case where we are waiting to rx_Read()
    from a call for too long (the other side of the call needs to give us
    more data). The second handles the case where we are waiting to
    rx_Write() for too long (the other side of the call needs to read some
    of the data we sent previously).
    
    This second case was added by commit
    c26dc0e6aaefedc55ed5c35a5744b5c01ba39ea1, but it has the general
    problem that this check does not check if anyone is actually trying to
    write to the call, and just tries to keep track of the last time we
    wrote to the call. So, we may have written some data to the call
    successfully, and then we went off to do something else. We can then
    kill the call later for taking too long to write to, even though
    nobody is trying to write to it. This results in a few problems:
    
     (1) When the fileserver is writing to the client, it may need to wait
         for various locks and it may need to wait for local disk I/O to
         complete. If this takes too long for any reason, the fileserver
         will kill the call (currently with VNOSERVICE), but the thread
         for servicing the call will still keep running until whatever the
         fileserver was waiting for finishes.
    
     (2) lastSendData is set whenever we send any ACK besides an
         RX_ACK_PING_RESPONSE (as of commit
         658d2f47281306dfd46c5eddcecaeadc3e3e7fa9). If we are the server,
         and we send any such ACK (in particular, RX_ACK_REQUESTED is
         common), the "idle dead" timer starts. This means the server can
         easily kill a call for idleness even if the server has never sent
         the client anything, and even if the server is still actively
         reading from the client.
    
     (3) When a client tries to issue an RPC for the server, the "idle
         dead" timeout effectively becomes a hard dead timeout, since we
         will write the RPC arguments to the Rx stream, and then wait for
         the server to respond with the output arguments. During this
         time, our 'lastSendData' is the last time we sent our arguments
         to the server, and so the call must finish before
         'call->lastSendData + idleDeadTime' is in the past.
    
    In addition to this "idle dead" processing, commit
    05f3a0d1e0359f604cc6162708f3f381eabcd1d7 appears to attempt to provide
    "idle dead"-like behavior by disabling Rx keepalives at certain points
    (when we're waiting for disk I/O), controlled by the application
    process (currently only the fileserver). The idea is that if
    keepalives are disabled, the server will just appear unreachable to
    the client, and so if disk I/O takes too long, the client will just
    kill the call because it looks like the server is gone. However, this
    also has some problems:
    
     (A) Clients send their own keepalives, and the server will still
         respond to them. So, the server will not appear to be
         inaccessible anyway. But even if it did work:
    
     (B) This approach only accounts for delays in disk I/O, and not
         anywhere else (we could hang for a wide variety of reasons). It
         also requires the fileserver to decide when it's okay for a call
         to be killed due to "idle dead" and when it's not, which
         currently seems to be decided somewhat arbitrarily.
    
     (C) This doesn't really let the client dictate its own "idle dead"
         timeout for idleness specifically; it just looks like the server
         went away.
    
     (D) The fileserver would appear to be unreachable in this situation,
         but it's not actually unreachable. This can be confusing to
         clients, since distinguishing between a server that is completely
         down vs just taking too long is an important distinction.
    
     (E) As noted in (1) above, the fileserver thread will still keep
         waiting for whatever it has been waiting for, even though the
         call has been killed and is thus useless.
    
    So instead of all of this stuff, just modify the rxi_CheckCall "idle
    dead" check to depend on the call->startWait parameter instead. This
    parameter will be set whenever anyone is waiting for something to
    proceed in the call, whether that is waiting to read data or write
    data. This should make "idle dead" processing much simpler, as it is
    reduced to effectively: if we've been waiting for longer than N
    seconds, kill the call.
    
    This involves ripping out much of the code related to lastSendData and
    rx_KeepAlive*. This means removing the call->lastSendData field and
    the rx_SetServerIdleDeadErr function, since those were only used for
    the behavior in c26dc0e6aaefedc55ed5c35a5744b5c01ba39ea1. This also
    means removing rx_KeepAliveOn and rx_KeepAliveOff, since those were
    only used for the behavior in
    05f3a0d1e0359f604cc6162708f3f381eabcd1d7. This commit also removes the
    only known use of the VNOSERVICE error code, so add some comments
    saying what this code was used for (especially since it is quite
    different from other V* error codes).
    
    Note that the behavior in (1) could actually be desirable in some
    situations. In environments that have clients without "idle dead"
    functionality, and those clients cannot be upgraded or reconfigured,
    this commit means those clients may hang forever if the server hangs
    forever. Some sites may want the fileserver to be able to kill such
    hanging calls, so the client will not hang (even if it doesn't free up
    the fileserver thread). However, such behavior should really be a
    special case for such sites, and not be the default behavior (or only
    behavior) for all sites. The fileserver should just be concerned with
    maintaining its own threads and availability, and clients should
    manage their own timeouts and handle hanging servers.
    
    Thanks to Markus Koeberl, who originally brought attention to some of
    the problematic behavior here, and helped investigate what was going
    on in the fileserver.
    
    Change-Id: Ie0497d24f1bf4ad7d30ab59061f96c3298f47d17
    Reviewed-on: http://gerrit.openafs.org/10773
    Reviewed-by: Daria Brashear <shadow@your-file-system.com>
    Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
    Tested-by: BuildBot <buildbot@rampaginggeek.com>

 src/afs/volerrors.h            |    5 +++-
 src/libafsrpc/afsrpc.def       |    4 +-
 src/libafsrpc/libafsrpc.la.sym |    2 -
 src/rx/liboafs_rx.la.sym       |    2 -
 src/rx/rx.c                    |   45 +++------------------------------------
 src/rx/rx.h                    |    7 ------
 src/rx/rx_call.h               |    1 -
 src/rx/rx_prototypes.h         |    2 -
 src/util/errors.h              |    5 +++-
 src/viced/afsfileprocs.c       |   41 ------------------------------------
 src/viced/viced.c              |    1 -
 src/volser/volser.p.h          |    5 +++-
 12 files changed, 18 insertions(+), 102 deletions(-)

-- 
OpenAFS Master Repository