[OpenAFS-devel] Re: Breaking callbacks on unlink

Andrew Deason adeason@sinenomine.net
Thu, 26 Jan 2012 20:11:38 -0600


On Thu, 26 Jan 2012 10:51:14 -0800
Russ Allbery <rra@stanford.edu> wrote:

> Andrew Deason <adeason@sinenomine.net> writes:
> 
> > If you want my opinion on what the _reason_ is, it's just that your
> > high rate of pag generation and high rate of writes is more than the
> > fileserver can handle, which is why I only ever see this stuff come
> > from you (at least, to this degree).
> 
> But, of course, it's not only me.  There are at least three sites that
> I know of that are seriously impacted by these sorts of reliability
> issues under load, and it's worth remembering that we only hear from a
> small percentage of sites.

I think I need to apologize for this. Some of this thread seems to
imply that "nobody" sees these issues besides Russ and those few other
sites, but that's not true. I didn't really mean that; the comment
above was only meant in the context of trying to think about what the
issue could be ("what is specific to Stanford?"), and I'm not trying to
belittle the issue.

While the sites I talk to frequently do tend to run pretty well with
1.4-based builds these days... they do so with a pile of patches on top
of 1.4.14, 1.4.12, 1.4.7, etc. So, if we are talking about just the
public releases (and in retrospect I'm not sure what else we'd be
talking about), yes, those have significant stability issues for
higher-volume sites, and none of the public releases are, by some
definitions, usable for them. Maybe those sites would have had
stability issues that look like this, but they were found and patched
early on.

-- 
Andrew Deason
adeason@sinenomine.net