1.2.9 unstable? (was [OpenAFS] inaccessible volume - please help)

Andrew Bacchi bacchi@rpi.edu
17 Sep 2003 08:28:23 -0400


I also experienced problems with 1.2.9.  Derrick Brashear recommended I
upgrade to 1.2.10, and now everything is stable again.

I had fileserver processes catching signal 11 and going into a salvage
loop every few minutes.  That stopped immediately after the upgrade.  I
also saw file system hangs on shutdown; those have disappeared as well.
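
For anyone trying to confirm a restart loop like this, a rough way to spot
it is to count the exit messages per process in BosLog.  The sketch below
is only an illustration: it assumes the log lines contain text in the form
quoted later in this thread ("fs:file exited with code 1"), and the
function name count_exits and the sample lines are made up for the example.

```python
import re
from collections import Counter

# Matches text such as "fs:file exited with code 1" or
# "fs:vol exited on signal 15" anywhere in a line (assumed format,
# taken from the BosLog excerpts quoted in this thread).
EXIT_RE = re.compile(r"(\w+):(\w+) exited (?:on signal|with code) (\d+)")


def count_exits(lines):
    """Count exit events per (instance, process) pair."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            instance, proc, _status = match.groups()
            counts[(instance, proc)] += 1
    return counts


# Hypothetical sample input; in practice, read the lines from BosLog.
sample = [
    "fs:vol exited on signal 15",
    "fs:salv exited with code 0",
    "fs:file exited with code 1",
    "fs:file exited with code 1",
]
exits = count_exits(sample)
for (instance, proc), n in sorted(exits.items()):
    print(f"{instance}:{proc} exited {n} time(s)")
```

A count that keeps climbing for fs:file between two checks would be
consistent with the loop described above, though it does not say why the
process is dying.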

On Thu, 2003-05-08 at 03:08, jarausch@igpm.rwth-aachen.de wrote:
> I changed two things. First I upgraded from Linux kernel
> 2.4.21-rc1-ac2 to 2.4.21-rc1-ac4 (there was at least an
> IDE hang at shutdown in the previous version)
> 
> and I think more importantly I stepped back to OpenAFS 1.2.8
> and all my problems went away immediately.
> 
> Has anybody experienced similar problems with 1.2.9?
> 
> Thanks for your help,
> Helmut.
> 
> On Wed, 7 May 2003 Derrick J Brashear <shadow@dementia.org> wrote:
> > On Wed, 7 May 2003 jarausch@igpm.rwth-aachen.de wrote:
> > 
> > > Hi all
> > > please bear in mind that I am not an experienced AFS admin.
> > > Indeed, it's my first (tiny) AFS installation with only a single
> > > server. It holds several volumes on partitions /vicepa .. /vicepc.
> > > All except one volume are still accessible. I think I saw an
> > > error message indicating something like a full disk or full volume
> > > during an rsync transfer Sunday night. But 'vos partinfo localhost'
> > > indicates lots of free space on each partition. Since the failing
> > > volume cannot be accessed anymore, I cannot say whether it's full.
> > > The BosLog for Sunday night shows a lot of messages like
> > > fs:vol exited on signal 15
> > > fs:salv exited with code 0
> > > fs:file exited with code 1
> > 
> > kill all the fileserver processes on the machine with kill -9.
> > 
> > the pthread problem of processes not dying completely is still there, in
> > different form.
> > 
> > > Could not fetch the list of volumes from the server
> > > Possible communication failure
> > > Error in vos listvol command.
> > > Possible communication failure
> > 
> > no volserver, no communication.
> > 
> > > These communication errors look strange to me.
> > > All volumes are on localhost and 'bos status'
> > > doesn't show any unusual status.
> > 
> > it doesn't show proc starts ever increasing?
> > 
> > > What can I do to recover such a volume, and what
> > > might have been the cause of the failure?
> > 
> > it's not clear any recovery is needed, yet.
> 
> -- 
> Helmut Jarausch
> 
> Lehrstuhl fuer Numerische Mathematik
> RWTH - Aachen University
> D 52056 Aachen, Germany
> 
> 
-- 
Facade: Provide a unified interface to a set of interfaces in a
subsystem.

Andrew Bacchi
Staff Systems Programmer
Rensselaer Polytechnic Institute
phone: 518 276-6415  fax: 518 276-2809

http://www.rpi.edu/~bacchi/