[OpenAFS] file corruption redux

Miles Davis miles@CS.Stanford.EDU
Wed, 31 May 2006 10:23:21 -0700


On Wed, May 31, 2006 at 01:08:14PM -0400, Derrick J Brashear wrote:
> On Wed, 31 May 2006, Miles Davis wrote:
> 
> >Sure, I suppose, but I can't think of what could do it -- cpu/cache?
> >Magical corrupting ethernet interface or driver (intel e1000)?
> 
> I doubt it. But it was worth asking.
> 
> >OK, here we go:
> >cmp -l aspell-bg-0.50-9.i386.rpm /tmp/aspell-bg-0.50-9.i386.rpm
> >1683429 377 177
> >
> >(same from at least two clients)
> >
> >tcpdump (4.4MB) file is at http://cs.stanford.edu/people/miles/tcpdump.out
> >Server is 171.64.64.67, client is 171.64.64.132.
> 
> That's a single bit error. That screams bad hardware. I will look at the 
> tcpdump, though.

Bugger. Well, while I have your attention, do you have an educated guess as to 
what I should yank & replace next? I already replaced the memory, and it's 
single-bit ECC...I haven't managed to get any failures from memtest86, but then 
again I don't recall ever getting memtest86 to find an error.
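
For anyone following along, the "single bit" reading of the cmp -l line quoted above can be checked by hand. cmp -l prints the 1-based byte offset in decimal, then the two differing byte values in octal. Below is a minimal Python sketch of that check. The helper name analyze_cmp_line is made up for illustration and is not part of any tool in this thread, and the sketch says nothing about which of the two copies is the good one.

```python
def analyze_cmp_line(line):
    """Parse one line of `cmp -l` output.

    Format: 1-based decimal byte offset, then the byte value from each
    file in octal. Returns (offset, left, right, differing_bit_positions).
    """
    offset_str, left_oct, right_oct = line.split()
    offset = int(offset_str)      # decimal, 1-based
    left = int(left_oct, 8)       # byte from the first file, octal
    right = int(right_oct, 8)     # byte from the second file, octal
    diff = left ^ right           # set bits mark where the bytes disagree
    flipped = [bit for bit in range(8) if (diff >> bit) & 1]
    return offset, left, right, flipped


# The line reported in the quoted message.
offset, left, right, flipped = analyze_cmp_line("1683429 377 177")
print(f"byte {offset}: 0x{left:02x} vs 0x{right:02x}, differing bits: {flipped}")
# prints: byte 1683429: 0xff vs 0x7f, differing bits: [7]
```

Octal 377 is 0xff and octal 177 is 0x7f, so the two copies differ only in the top bit of that one byte, which matches the single-bit diagnosis.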

-- 
// Miles Davis - miles@cs.stanford.edu - http://www.cs.stanford.edu/~miles
// Computer Science Department - Computer Facilities
// Stanford University