[OpenAFS] Re: File corruption, 1.4.1 & 1.4.4 on linux clients

cball@bu.edu cball@bu.edu
Thu, 26 Apr 2007 08:32:23 -0400 (EDT)


On Wed, 25 Apr 2007, Derrick J Brashear wrote:

> On Tue, 24 Apr 2007 cball@bu.edu wrote:
>
> > We are serving up a virus .dat file to mail relays via AFS readonly.
> > The file is periodically updated, and the volume where it lives is
> > re-released hourly whether an update occurred or not.  Read activity is
> > constant.
> >
> > When vos release occurs, the fileserver logs a message like this:
> >
> > Mon Apr 23 17:04:28 2007 fssync: volume 536959020 restored; breaking all
> > call backs
> >
> > [ normal behavior ]
> >
> > At erratic intervals, the virus scanner on one of our mail relay systems
> > will choke on the database file reporting that it's invalid.  When this
> > happens, the file remains invalid until a re-release occurs or a manual fs
> > flush is invoked.
>
> Let me guess, it's mmap()ed by whatever is using it, directly in /afs?

The file is not being mmap()ed.  In case I wasn't clear: the affected client
system consistently serves up the corrupted file, which contains null bytes,
to both new and old processes until the cached version of the file is flushed.
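
One rough way to confirm the null-byte symptom on a client is to scan the
cached file for long runs of zero bytes.  The sketch below is illustrative
only: the function name find_null_runs, the 4096-byte threshold and the chunk
size are assumptions, not anything taken from OpenAFS or uvscan, and a valid
.dat file may legitimately contain some zero bytes, so a hit is a hint rather
than proof of corruption.

```python
import re

# Matches either a maximal run of zero bytes or a maximal run of non-zero bytes.
_SEGMENT = re.compile(rb"\x00+|[^\x00]+")


def find_null_runs(path, min_run=4096, chunk_size=1 << 16):
    """Return (offset, length) for each run of zero bytes >= min_run bytes.

    min_run and chunk_size are arbitrary illustrative defaults.
    """
    runs = []
    run_start = None  # file offset where the current zero run began, if any
    offset = 0        # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for m in _SEGMENT.finditer(chunk):
                pos = offset + m.start()
                if chunk[m.start()] == 0:
                    # Zero segment: open a run unless one carried over
                    # from the previous chunk.
                    if run_start is None:
                        run_start = pos
                elif run_start is not None:
                    # Non-zero segment closes the open run.
                    if pos - run_start >= min_run:
                        runs.append((run_start, pos - run_start))
                    run_start = None
            offset += len(chunk)
    # A run that extends to end of file is closed here.
    if run_start is not None and offset - run_start >= min_run:
        runs.append((run_start, offset - run_start))
    return runs
```

Running this against the same path on a client that reports the file as
invalid and on one that does not, and comparing the two results, would show
whether the bad copy has zero-filled regions that the good copy lacks.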

-Charles

----
strace uvscan --version
[...]
stat64("/afs/[foobar]/scan.dat", {st_mode=S_IFREG|0666, st_size=10361745, ...}) = 0
lstat64("/afs/[foobar]/scan.dat", {st_mode=S_IFREG|0666, st_size=10361745, ...}) = 0
access("/afs/[foobar]/scan.dat", R_OK) = 0
open("/afs/[foobar]/scan.dat", O_RDONLY|O_LARGEFILE) = 4
access("/afs/[foobar]/scan.dat", W_OK) = -1 EROFS (Read-only file system)
[...]
----