[OpenAFS-devel] OpenAFS 1.4.0 rc3 crashes on Linux 2.6

Andy Lutomirski amluto@hotmail.com
Sun, 13 Nov 2005 21:54:27 +0000


>From: "chas williams - CONTRACTOR" <chas@cmf.nrl.navy.mil>
>To: "Andy Lutomirski" <amluto@hotmail.com>
>CC: openafs-devel@openafs.org
>Subject: Re: [OpenAFS-devel] OpenAFS 1.4.0 rc3 crashes on Linux 2.6
>Date: Mon, 24 Oct 2005 07:58:31 -0400
>
>In message <BAY106-F43222E50704F8710B4F08C1750@phx.gbl>,"Andy Lutomirski" 
>writes:
> >(gdb) info line *osi_UFSOpen+440
> >Line 72 of
> >"/var/tmp/portage/openafs-kernel-1.4.0_rc6/work/openafs-1.4.0-rc6/src/libafs/MODLOAD-2.6.13-gentoo-r3-SP/osi_file.c"
> >starts at address 0x523a8 <osi_UFSOpen+440> and ends at 0x523ad
> ><osi_UFSOpen+445>.
>
>this seems to indicate that it failed to open a file in your
>cache.  where is your cache located and what type of fs is
>your cache filesystem?
>
>...
>
>hmm... this could be trouble.  does gentoo have some sort of daemon
>that might cleanup /tmp?  or worse, is /tmp some sort of
>memory filesystem?  try putting cache in /var (assuming
>var is ext3 or ext2).

I've confirmed that this is the problem.  I can trigger the crash with 
explicit tmpreaper invocations, and OpenAFS has been stable for 14 days 
without tmpreaper running.  (Lesson -- tmpreaper's --protect option doesn't 
actually protect directories.)

Is there any possibility of getting either a more informative crash message 
in future versions of openafs (like "Possible cache corruption detected") 
or, even better, a graceful failure instead of a crash (just fail the 
request instead of OOPSing)?
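
To make the request concrete, here is a rough user-space sketch of the 
behaviour I have in mind.  It is not OpenAFS code: open_cache_file() and 
handle_request() are hypothetical names I made up, and the real kernel 
module opens cache files through different interfaces.  The only point is 
that a failed open is logged and turned into an error for that one request.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not an OpenAFS function: open one cache file and
 * report failure to the caller instead of aborting. */
static int open_cache_file(const char *path, int *fd_out)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        int err = errno;   /* save errno before fprintf can change it */
        fprintf(stderr,
                "cache file %s could not be opened (errno %d); "
                "possible cache corruption detected\n", path, err);
        return -err;
    }
    *fd_out = fd;
    return 0;
}

/* Hypothetical request handler: a missing cache file fails only this
 * request with -EIO; nothing else is torn down. */
static int handle_request(const char *cache_path)
{
    int fd;
    int rc = open_cache_file(cache_path, &fd);
    if (rc != 0)
        return -EIO;
    /* ... the real work on the cache file would happen here ... */
    close(fd);
    return 0;
}
```

With something along these lines, a cache file deleted behind the client's 
back would show up as an I/O error on the affected request plus a log 
message, rather than an oops.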

Thanks,
Andy
