[OpenAFS] compile fails kernel version 4.4.0-1-default

Chas Williams 3chas3@gmail.com
Tue, 08 Mar 2016 06:02:35 -0500


On Tue, 2016-03-08 at 10:16 +0100, Denis Lohner wrote:
> Am 08.03.2016 um 04:37 schrieb Benjamin Kaduk:
> >>> There
> >>> > > are many call paths in the cache manager that end up at this function,
> >>> > > most of which are not prepared to properly handle an ERESTARTSYS
> >>> > > return.
> >>> > > Since this status can be returned after some data has already been
> >>> > > written, the correct behavior upon receiving it is far from clear ...
> >>> > > a
> >>> > > path towards a client free of this vector for data corruption may
> >>> > > involve
> >>> > > avoiding the dependence on splice_from_pipe_next() in preference to
> >>> > > adopting all call sites to handle the ERSTARTSYS case.
> >> >
> >> > For the 1.6 release, this seems the best choice of action.  The "real"
> >> > fix would likely be difficult to completely test in a timely fashion.
> > That only helps if we know what the replacement would be...I am not a
> > linux VFS expert and do not have any ideas right now.
> 
> 
> I am not a kernel/driver developer nor a file system developer. So
> please forgive, if the following makes no sense at all.
> 
> As far as I understand the issue and the openafs sources, the problem
> arises as afs_linux_storeproc uses the splice api that can return
> ERESTARTSYS as of kernel version 4.4.
> A quick search in the NEWS file and git logs suggests that
> afs_linux_storeproc was introduced in OpenAFS 1.5.69 (2010-01-19) as a
> performance improvement:
> " Linux
> 
>    * Use splice to speed up storing files."
> 
> The original behaviour which uses seperate reads/writes instead of
> splice and that is (still) used on non-linux systems remained in
> afs_GenericStoreProc in src/afs/afs_fetchstore.c .
> 
> So my question is: Is it possible to rereplace afs_linux_storeproc with
> afs_GenericStoreProc on linux kernel versions >=4.4  as a temporary
> solution to the issue either in the openafs sources or as a distribution
> specific patch, trading some performance for data integrity?

That would be the first thing I tried.  This code was brought into the
tree on commit 34ffc9cd7d7eed62229704ad0e1d327f076ea7b6.  There doesn't seem
to be any additional side effects, so simply not using afs_linux_storeproc should
still work.

I won't have time to look at this until this weekend though.