[OpenAFS] OpenAFS 1.2.9 fileserver coredumped

Renata Maria Dart <renata@SLAC.Stanford.EDU>
Fri, 23 Jan 2004 10:37:58 -0800 (PST)


>Date: Fri, 23 Jan 2004 13:30:19 -0500 (EST)
>From: Derrick J Brashear <shadow@dementia.org>
>Subject: Re: [OpenAFS] OpenAFS 1.2.9 fileserver coredumped
>To: openafs-info@openafs.org
>
>On Fri, 23 Jan 2004, Renata Maria Dart wrote:
>
>> Hi, we have 8 Solaris 9 fileservers running a mixture of OpenAFS
>> 1.2.9 and 1.2.10.  The 1.2.9 fileservers have all been running
>> uneventfully since last September until last night, when one of
>> them restarted and left a corefile.fs:
>
>Look at the start of FileLog.old; the assert() can result in the
>beginning of the log being overwritten.

Hmmm, doesn't look like the start of FileLog.old got overwritten:

Mon Sep 15 10:24:15 2003 File server starting
Mon Sep 15 10:24:16 2003 afs_krb_get_lrealm failed, using slac.stanford.edu.
Mon Sep 15 10:24:16 2003 Set thread id 62 for FSYNC_sync
Mon Sep 15 10:24:26 2003 Partition /vicepb: attached 373 volumes; 0 volumes not attached

>
>> =>[1] _lwp_kill(0x0, 0x6, 0x0, 0xff1bc000, 0x5, 0x248800a), at 0xff19e42c
>>   [2] raise(0x6, 0x0, 0xf95fb958, 0xff1bc000, 0x0, 0x0), at 0xff14cd70
>>   [3] abort(0x0, 0xe4f0c, 0xf95fb9e8, 0x117320, 0x2bd, 0x0), at 0xff135c60
>>   [4] AssertionFailed(0x117320, 0x2bd, 0x2, 0xf95fba00, 0x125f00, 0x1400), at 0x4a500
>>   [5] VPutVnode_r(0xf95fbb2c, 0xb384c0, 0x65fbb8, 0x12197c, 0x6a7f68, 0x65a738), at 0x521f4
>>   [6] VPutVnode(0xf95fbb2c, 0xb384c0, 0x12c930, 0x12197c, 0x1218b2, 0x834), at 0x52060
>>   [7] PutVolumePackage(0x0, 0xb384c0, 0xac4928, 0xf10058, 0x0, 0x12ec00), at 0x389cc
>
>>
>> Since this fileserver has restarted, it is now running 1.2.10.  I would
>> like to know if the cause of this failure has been fixed in 1.2.10 and
>> if I should just upgrade all of my 1.2.9 systems, or is this a problem
>> that still needs to be addressed.
>
>If it's
>            assert(vnp->cacheCheck == vp->cacheCheck);
>
>then it was addressed with less than complete success in 1.2 and with much
>more success in 1.3.

Since there was no further info at the top of FileLog.old (in fact, I
searched the entire log for the string "assert"), is this another problem?
Is there any other information I can get for you?

-Renata


>
>cacheCheck went from being a short with danger of wrapping to a long.

 Renata Dart                         | renata@SLAC.Stanford.edu  
 Stanford Linear Accelerator Center  |    
 2575 Sand Hill Road, MS 97          | (650) 926-2848 (office)
 Stanford, California   94025        | (650) 926-3329 (fax)