[OpenAFS] compile fails kernel version 4.4.0-1-default

Benjamin Kaduk kaduk@MIT.EDU
Tue, 1 Mar 2016 22:31:00 -0500 (EST)



On Tue, 1 Mar 2016, Michael Laß wrote:

> Hi!
>
> On 23.01.2016 at 18:22, Benjamin Kaduk <kaduk@MIT.EDU> wrote:
> >
> > Though the patches linked there are sufficient to permit the build to
> > complete, there are some more subtle behavior changes in the kernel in
> > that some of the splice functions will now return ERESTARTSYS if there is
> > any signal pending in the current process.  In particular, there are
> > presumed to be codepaths for which we do not have proper error handling,
> > that could lead to data loss.  Further analysis is needed (which I am not
> > prepared to undertake at present).
>
> It seems like you were spot on with this. Some Arch Linux users have been
> brave enough to test OpenAFS with these patches on Linux 4.4. One reported
> a data corruption issue now. Quote from
> https://aur.archlinux.org/packages/openafs/:
>
> > I tried the patch and I get problems. When I do checkout a different
> > branch of my software from a git repository things fail and I'm left
> > with a corrupted workspace. The log files shows the following message:
> > kernel: afs: Lost contact with file server ... in cell ... (code -512) (all multi-homed ip addresses down for the server)
> > kernel: afs: failed to store file (network problems)
> > kernel: afs: file server ... in cell ... is back up (code 0) (multi-homed address; other same-host interfaces may still be down)
>
>
> Guess what error code -512 is… Yep, it’s -ERESTARTSYS.
>
> So there is definitely some additional work required for Linux 4.4.
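
For readers following along: 512 is the kernel-internal ERESTARTSYS value
from include/linux/errno.h. It is not defined in the userspace <errno.h>
and is not meant to be seen by user programs. Below is a minimal sketch,
in plain userspace C, of telling "interrupted by a pending signal" apart
from a real failure. The constant name KERNEL_ERESTARTSYS, the enum, and
both functions are hypothetical names made up for illustration; this is
not OpenAFS code and not the fix discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel-internal value (include/linux/errno.h); assumed here because
 * userspace <errno.h> does not define it. */
#define KERNEL_ERESTARTSYS 512

/* Hypothetical classification of a negative-errno style return code. */
enum io_result {
    IO_OK,           /* code 0: operation succeeded */
    IO_INTERRUPTED,  /* -ERESTARTSYS: a signal was pending */
    IO_OTHER_ERROR   /* any other nonzero code */
};

/* Map a return code to a category. The point of the sketch is that
 * -512 gets its own category instead of being lumped in with
 * network or server failures. */
static enum io_result classify_code(int code)
{
    if (code == 0)
        return IO_OK;
    if (code == -KERNEL_ERESTARTSYS)
        return IO_INTERRUPTED;
    return IO_OTHER_ERROR;
}

/* Human-readable description of a category. */
static const char *describe_result(enum io_result r)
{
    switch (r) {
    case IO_OK:          return "ok";
    case IO_INTERRUPTED: return "interrupted by pending signal";
    case IO_OTHER_ERROR: return "other error";
    }
    return NULL; /* not reached for valid enum values */
}
```

With this sketch, classify_code(-512) yields IO_INTERRUPTED, whereas the
log lines quoted above show the same code being reported as lost contact
with the file server.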

Hi Michael,

Thank you for reporting this back to the list.  To the list members:

I would like to point out that no OpenAFS developer has stated that they
are working on this issue, and it appears that a proper fix will require
modifications through many different parts of the cache manager; that is,
it will be an invasive change that requires substantial development
effort.  There is a real risk that OpenAFS will not be able to support
kernels from the 4.4 series and newer -- the openafs package is slated for
removal from Debian testing in just three weeks.

If there are sites that will be adversely affected by the lack of a
functioning OpenAFS client for Linux kernel 4.4 or newer, it will be
easier if they can contribute resources now, rather than months from now
when these kernels make their way into the Linux distributions deployed at
these sites.

-Ben