[OpenAFS] OpenAFS 1.2.9 fileserver coredumped

Derrick J Brashear shadow@dementia.org
Fri, 23 Jan 2004 13:30:19 -0500 (EST)


On Fri, 23 Jan 2004, Renata Maria Dart wrote:

> Hi, we have 8 solaris 9 fileservers running a mixture of OpenAFS
> 1.2.9 and 1.2.10.  The 1.2.9 fileservers have all been running
> uneventfully since last September until last night when one of
> them restarted and left a corefile.fs:

Look at the start of FileLog.old; the assert() can result in the
beginning of the log being overwritten.

> =>[1] _lwp_kill(0x0, 0x6, 0x0, 0xff1bc000, 0x5, 0x248800a), at 0xff19e42c
>   [2] raise(0x6, 0x0, 0xf95fb958, 0xff1bc000, 0x0, 0x0), at 0xff14cd70
>   [3] abort(0x0, 0xe4f0c, 0xf95fb9e8, 0x117320, 0x2bd, 0x0), at 0xff135c60
>   [4] AssertionFailed(0x117320, 0x2bd, 0x2, 0xf95fba00, 0x125f00, 0x1400), at 0x4a500
>   [5] VPutVnode_r(0xf95fbb2c, 0xb384c0, 0x65fbb8, 0x12197c, 0x6a7f68, 0x65a738), at 0x521f4
>   [6] VPutVnode(0xf95fbb2c, 0xb384c0, 0x12c930, 0x12197c, 0x1218b2, 0x834), at 0x52060
>   [7] PutVolumePackage(0x0, 0xb384c0, 0xac4928, 0xf10058, 0x0, 0x12ec00), at 0x389cc

>
> Since this fileserver has restarted, it is now running 1.2.10.  I would
> like to know if the cause of this failure has been fixed in 1.2.10 and
> if I should just upgrade all of my 1.2.9 systems, or is this a problem
> that still needs to be addressed.

If it's
            assert(vnp->cacheCheck == vp->cacheCheck);

then it was addressed with less than complete success in 1.2 and with much
more success in 1.3.

cacheCheck went from being a short, with the danger of wrapping, to a long.
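
To illustrate the wrapping danger in general terms, here is a minimal
sketch. It is not the OpenAFS code: the type names, the function names and
the choice of unsigned 16-bit and 32-bit integers are all assumptions made
for the example. A 16-bit counter has only 65536 distinct values, so after
that many increments it hands out a value it has already used, and an
equality check on it can no longer tell two generations apart.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, not OpenAFS types. */
typedef uint16_t short_gen_t;   /* models a 16-bit "short" counter */
typedef uint32_t long_gen_t;    /* models a wider 32-bit counter   */

/* Advance a 16-bit generation counter n times; it wraps modulo 65536. */
short_gen_t bump_short(short_gen_t start, uint32_t n)
{
    short_gen_t g = start;
    for (uint32_t i = 0; i < n; i++)
        g = (short_gen_t)(g + 1);
    return g;
}

/* Advance a 32-bit generation counter n times; no wrap at these sizes. */
long_gen_t bump_long(long_gen_t start, uint32_t n)
{
    long_gen_t g = start;
    for (uint32_t i = 0; i < n; i++)
        g = g + 1;
    return g;
}

/* Small self-check of the behavior described above. */
void wrap_demo(void)
{
    /* After 65536 bumps the 16-bit counter is back where it started. */
    assert(bump_short(5, 65536u) == 5);
    /* The wider counter is still distinct from its starting value. */
    assert(bump_long(5, 65536u) == 65541u);
}
```

Calling wrap_demo() from a main() runs both checks. The wider counter can
still wrap eventually, but only after about 4 billion increments instead of
65536.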