OpenAFS Master Repository branch, openafs-stable-1_8_x, updated. openafs-stable-1_8_2-28-g8ca82f1
Gerrit Code Review
gerrit@openafs.org
Fri, 25 Jan 2019 12:45:56 -0500
The following commit has been merged in the openafs-stable-1_8_x branch:
commit 8ca82f1252db2e3f17f6a9080f56d74035bdaa16
Author: Andrew Deason <adeason@sinenomine.net>
Date: Thu Jan 17 00:12:06 2019 -0600
afs: Do not ignore errors in afs_CacheFetchProc
afs_CacheFetchProc currently has a section of code that looks like
this pseudocode:
if (!code) do {
while (length > 0) {
code = read_from_rx();
if (code) {
break;
}
code = write_to_cache();
if (code) {
break;
}
}
code = 0;
} while (moredata);
return code;
When we encounter an error when reading from rx or writing to the
cache, we break out of the current loop to stop processing and return
an error. But there are _two_ loops in this section of the code, so
what we actually do is break out of the inner loop, set 'code' to 0,
and then usually return (since 'moredata' is usually never set).
This means that when we encounter an unexpected error either from the
net or disk (or the memcache layer), we ignore the error and return
success. This means that we'll store a subset of the relevant chunk's
data to disk, and flag that chunk as complete and valid for the
relevant DV. If the error occurred before we wrote anything to disk,
this means we'll store an empty chunk and flag it as valid. The chunk
will be flagged as valid forever, serving invalid data, until the
cache chunk is evicted or manually kicked out. This can result in
files and directories appearing blank or truncated to applications
until the bad chunk is removed.
Possibly the most common way to encounter this issue is when using a
disk cache, and the underlying disk partition is full, resulting in an
unexpected ENOSPC error. Theoretically this can be seen from an
unexpected error from Rx, but we would have to see a short read from
Rx without the Rx call being aborted. If the call was aborted, we'd
get an error from the call to rx_EndCall() later on.
To fix this, change all of these 'break's into 'goto done's, to be
more explicit about where we are jumping to. Convert all of the
'break's in this function in the same way, to make the code flow more
consistent and easier to follow. Remove the 'if () do' on a single
line, since it makes it a little harder to see from a casual glance
that there are two nested loops here.
This problem appears to have been introduced in commit 61ae8792 (Unite
CacheFetchProcs and add abstraction calls), included in OpenAFS
1.5.62.
Reviewed-on: https://gerrit.openafs.org/13428
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Cheyenne Wills <cwills@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit ce327b568f4ff522aa008f235d97e0d9144eb92c)
Change-Id: Id4ec8ffef38b4c86beffc6272bd283bce2c74ffe
Reviewed-on: https://gerrit.openafs.org/13443
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
src/afs/afs_fetchstore.c | 13 +++++++++----
1 files changed, 9 insertions(+), 4 deletions(-)
--
OpenAFS Master Repository