Aw: RE: [OpenAFS] compile fails kernel version 4.4.0-1-default

Chas Williams
Mon, 07 Mar 2016 13:41:46 -0500

On Mon, 2016-03-07 at 01:42 -0500, Benjamin Kaduk wrote:
> On Thu, 3 Mar 2016, Michael Dressel wrote:
> > Hi,
> >
> > it is me who reported the issue on archlinux. Let me know if I can help
> > with reproducing the issue or anything else.
> > For curiosity, could anyone explain to me what change in the 4.4 kernel
> > created the issue for openafs? Or rather what was the intention of the
> > change in 4.4?
> I am given to understand that the proximal trigger is linux commit
> ?id=c725bfce7968009756ed2836a8cd7ba4dc163011,
> which adds a path wherein -ERESTARTSYS can be returned from within
> library code.  (Maybe there are other such paths that we just didn't
> notice before?)  This particular function, splice_from_pipe_next(), ends
> up getting called from the low-level afs_linux_storeproc() routine.

I haven't had time to look at this, but does this also happen with
the memcache?

> There
> are many call paths in the cache manager that end up at this function,
> most of which are not prepared to properly handle an ERESTARTSYS return.
> Since this status can be returned after some data has already been
> written, the correct behavior upon receiving it is far from clear ... a
> path towards a client free of this vector for data corruption may involve
> avoiding the dependence on splice_from_pipe_next() in preference to
> adapting all call sites to handle the ERESTARTSYS case.

For the 1.6 release, avoiding splice_from_pipe_next() seems the best
course of action.  The "real" fix would likely be difficult to test
completely in a timely fashion.
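
To illustrate why an interrupted store is awkward for callers, here is a
minimal userspace sketch.  It is not OpenAFS or kernel code: demo_sink,
demo_sink_write() and demo_store() are hypothetical names, and
DEMO_ERESTARTSYS is defined locally because the kernel's ERESTARTSYS is
not exposed through userspace <errno.h>.  It only shows that a caller has
to record how much was already written before the error if it wants to
resume without duplicating or dropping data.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: local stand-in for the kernel-internal ERESTARTSYS value. */
#define DEMO_ERESTARTSYS 512

/* Hypothetical lower layer that can be "interrupted" once, mid-transfer. */
struct demo_sink {
    char   buf[64];
    size_t used;          /* bytes stored so far */
    size_t interrupt_at;  /* fail once when 'used' reaches this offset */
    int    interrupted;   /* set after the single simulated interruption */
};

/* Writes at most 4 bytes per call; returns bytes written or a negative error. */
static long demo_sink_write(struct demo_sink *s, const char *data, size_t len)
{
    if (!s->interrupted && s->used >= s->interrupt_at) {
        s->interrupted = 1;
        return -DEMO_ERESTARTSYS;
    }
    size_t room = sizeof(s->buf) - s->used;
    size_t n = len < 4 ? len : 4;
    if (n > room)
        n = room;
    memcpy(s->buf + s->used, data, n);
    s->used += n;
    return (long)n;
}

/*
 * Caller that keeps track of partial progress.  Returns the number of
 * bytes stored; *err is 0 on success or the negative error that stopped
 * the loop.  Both values are needed to resume correctly.
 */
static size_t demo_store(struct demo_sink *s, const char *data, size_t len,
                         long *err)
{
    size_t done = 0;
    *err = 0;
    while (done < len) {
        long rc = demo_sink_write(s, data + done, len - done);
        if (rc < 0) {
            *err = rc;      /* error arrives after 'done' bytes were written */
            break;
        }
        if (rc == 0)
            break;          /* sink is full */
        done += (size_t)rc;
    }
    return done;
}
```

A caller that sees the error and simply retries the whole buffer from
offset 0 would store the first bytes twice; one that treats the error as
"nothing written" would lose them.  Resuming from data + done avoids both,
but every call site has to be written that way.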