[OpenAFS] compile fails kernel version 4.4.0-1-default

Stephan Wiesand stephan.wiesand@desy.de
Tue, 8 Mar 2016 14:37:05 +0100


> On 08 Mar 2016, at 12:02, Chas Williams <3chas3@gmail.com> wrote:
>=20
> On Tue, 2016-03-08 at 10:16 +0100, Denis Lohner wrote:
>> Am 08.03.2016 um 04:37 schrieb Benjamin Kaduk:
>>>>> There
>>>>>>> are many call paths in the cache manager that end up at this =
function,
>>>>>>> most of which are not prepared to properly handle an ERESTARTSYS
>>>>>>> return.
>>>>>>> Since this status can be returned after some data has already =
been
>>>>>>> written, the correct behavior upon receiving it is far from =
clear ...
>>>>>>> a
>>>>>>> path towards a client free of this vector for data corruption =
may
>>>>>>> involve
>>>>>>> avoiding the dependence on splice_from_pipe_next() in preference =
to
>>>>>>> adopting all call sites to handle the ERSTARTSYS case.
>>>>>=20
>>>>> For the 1.6 release, this seems the best choice of action.  The =
"real"
>>>>> fix would likely be difficult to completely test in a timely =
fashion.
>>> That only helps if we know what the replacement would be...I am not =
a
>>> linux VFS expert and do not have any ideas right now.
>>=20
>>=20
>> I am not a kernel/driver developer nor a file system developer. So
>> please forgive, if the following makes no sense at all.
>>=20
>> As far as I understand the issue and the openafs sources, the problem
>> arises as afs_linux_storeproc uses the splice api that can return
>> ERESTARTSYS as of kernel version 4.4.
>> A quick search in the NEWS file and git logs suggests that
>> afs_linux_storeproc was introduced in OpenAFS 1.5.69 (2010-01-19) as =
a
>> performance improvement:
>> " Linux
>>=20
>>    * Use splice to speed up storing files."
>>=20
>> The original behaviour which uses seperate reads/writes instead of
>> splice and that is (still) used on non-linux systems remained in
>> afs_GenericStoreProc in src/afs/afs_fetchstore.c .
>>=20
>> So my question is: Is it possible to rereplace afs_linux_storeproc =
with
>> afs_GenericStoreProc on linux kernel versions >=3D4.4  as a temporary
>> solution to the issue either in the openafs sources or as a =
distribution
>> specific patch, trading some performance for data integrity?
>=20
> That would be the first thing I tried.  This code was brought into the
> tree on commit 34ffc9cd7d7eed62229704ad0e1d327f076ea7b6.  There =
doesn't seem
> to be any additional side effects, so simply not using =
afs_linux_storeproc should
> still work.

So we'd simply do something like this


diff --git a/src/afs/afs_fetchstore.c b/src/afs/afs_fetchstore.c
index f65f40c..2630209 100644
--- a/src/afs/afs_fetchstore.c
+++ b/src/afs/afs_fetchstore.c
@@ -326,7 +326,7 @@ struct storeOps rxfs_storeUfsOps =3D {
     .padd =3D    rxfs_storePadd,
     .close =3D   rxfs_storeClose,
     .destroy =3D rxfs_storeDestroy,
-#ifdef AFS_LINUX26_ENV
+#if defined(AFS_LINUX26_ENV) && defined(LINUX_USE_SPLICE)
     .storeproc =3D afs_linux_storeproc
 #else
     .storeproc =3D afs_GenericStoreProc


and add a configure test defaulting to off?


--=20
Stephan Wiesand
DESY - DV -
Platanenallee 6
15738 Zeuthen, Germany