[OpenAFS] Git throwing bus error when pack files not entirely cached

Benjamin Kaduk kaduk@MIT.EDU
Sun, 13 Mar 2016 16:53:57 -0400 (EDT)


On Sun, 13 Mar 2016, Michael Laß wrote:

> Hi,
>
> while trying to stress test the proposed patches for Linux 4.4 I
> encountered an issue using git inside AFS that is not connected to the
> Linux 4.4 issue:
>
> I cloned the Linux kernel repo inside AFS. On this freshly cloned repo,
> all data is stored in a single .pack file
> (.git/objects/pack/pack-somehash.pack) that is about 1 GB in size.
> When checking out revisions
> of this repository, git almost always aborts with a bus error and
> leaves the working copy in a broken state. This happens at different
> stages of the checkout, and only in rare cases does the checkout finish
> without an error.
>
> Now here's the interesting part: If the OpenAFS cache is large enough
> and the pack file is entirely cached (e.g. by running "cat
> .git/objects/pack/pack-somehash.pack > /dev/null" before) the errors
> disappear completely. After flushing the cache (fs flushall) they
> reappear. The smaller the cache size, the more often the errors seem to
> occur.
>
> I repeatedly computed a checksum of the pack file under different
> circumstances and the checksum was always identical, so only git seems
> to see a problem here.
>
> File server versions tested:
> - 1.6.9-2+deb8u4 from Debian Stable (Jessie)
> - 1.6.15 from Debian Sid, running on Jessie
>
> Client versions tested:
> - 1.6.9-2+deb8u4 from Debian Jessie
> - 1.6.15-1 from Debian Sid, running on Jessie
> - 1.6.16 + proposed Linux-4.4 patches on Arch Linux
>
> Git versions tested:
> - 2.1.4-2.1+deb8u1 on Debian Jessie
> - 2.7.2 on Arch Linux
>
> Has this problem been observed before? Anything I can test to track
> this down? It's at least easily reproducible.

Thanks for the report, and the work to narrow down the reproduction case.

I don't think I've encountered this myself (on Linux at least), but I will
note that, if I remember correctly, git uses mmap to access the packfile,
so afs_linux_readpages() is what would need closer examination.
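
One rough way to exercise the mmap path outside of git is to checksum the
pack file twice, once with plain read() calls and once through a read-only
memory mapping, and compare the results. The Python sketch below is only an
illustration: the function names are hypothetical and nothing like it was
part of the original report. Whether sequential access through a mapping
triggers the same fault as git's access pattern is an assumption.

```python
import hashlib
import mmap


def sha1_via_read(path, chunk_size=1 << 20):
    """Hash the file using ordinary read() calls."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha1_via_mmap(path, chunk_size=1 << 20):
    """Hash the file through a read-only mapping, so the data is
    brought in by page faults rather than by read().
    Note: mapping an empty file raises ValueError."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, len(m), chunk_size):
                digest.update(m[offset:offset + chunk_size])
    return digest.hexdigest()


def digests_match(path):
    """Return True if both access paths see identical file contents."""
    return sha1_via_read(path) == sha1_via_mmap(path)
```

Running digests_match() on the pack file right after "fs flushall" would be
the interesting case. A bus error or a False result there would suggest the
mapped read path, while a clean True would only show that this particular
access pattern does not reproduce the problem.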

-Ben