[OpenAFS-devel] stability problems, and interesting symptoms...
Neulinger, Nathan
nneul@umr.edu
Wed, 30 May 2001 12:09:31 -0500
I added a pile of debugging to volume.c and volprocs.c and came to this:
    fdP = IH_OPEN(h);
    if (fdP == NULL) {
        Log("ReadHeader: %s:%d\n", __FILE__, __LINE__);
        *ec = VSALVAGE;
        return;
    }
in ReadHeader in volume.c. The IH_OPEN call is failing, which is why the
volume gets flagged as needing salvage. I'm going to try bumping up
inode-max and file-max on the box in question - we'll see if that makes
any difference.
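One way to narrow this down is to log why the open failed, not just where.
The following is a minimal sketch under stated assumptions, not OpenAFS code:
open_failure_hint and format_open_failure are hypothetical helpers, and the
sketch assumes errno is still meaningful right after IH_OPEN returns NULL,
which is not confirmed here. EMFILE would point at the per-process descriptor
limit, while ENFILE would point at the system-wide file table (the one
file-max controls).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper (not part of OpenAFS): map the errno left by a
 * failed open to a hint about which limit, if any, was hit. */
static const char *
open_failure_hint(int err)
{
    switch (err) {
    case EMFILE:
        return "per-process descriptor limit reached";
    case ENFILE:
        return "system-wide file table full";
    case ENOENT:
        return "file does not exist";
    default:
        return "other error";
    }
}

/* Hypothetical helper: format one diagnostic line into buf, including the
 * current soft RLIMIT_NOFILE (-1 if unlimited or unavailable).
 * Returns the snprintf result. */
static int
format_open_failure(char *buf, size_t len, const char *file, int line, int err)
{
    struct rlimit rl;
    long soft = -1;

    assert(buf != NULL && len > 0);
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY)
        soft = (long)rl.rlim_cur;

    return snprintf(buf, len,
                    "ReadHeader: %s:%d errno=%d (%s): %s; nofile soft limit=%ld\n",
                    file, line, err, strerror(err),
                    open_failure_hint(err), soft);
}
```

In ReadHeader this could be used by saving errno immediately after the failed
IH_OPEN, before Log or anything else can overwrite it, and then passing the
formatted buffer to Log.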
-- Nathan
> -----Original Message-----
> From: Neulinger, Nathan [mailto:nneul@umr.edu]
> Sent: Wednesday, May 30, 2001 10:46 AM
> To: 'openafs-devel@openafs.org'
> Subject: [OpenAFS-devel] stability problems, and interesting symptoms...
>
>
> I've got two problems and one interesting symptom, though the symptom
> is probably unrelated to the first problem.
>
> First, on a couple of my servers (and this started happening about a
> month or so back, with no apparent changes to server hardware or
> software) - if I start moving volumes off the server en masse, one at a
> time, one after another, then at some point in the process, after
> 50-100 volumes have been moved, I get a volserver error complaining
> about being unable to attach a volume. Once that happens, from then on,
> any listvol or volserver activity against the server fails. Usually bos
> status indicates that vol exited with signal 6, although not
> necessarily immediately (I haven't seen that with openafs yet, but that
> was typically what I saw with 3.6-2.3). I have no error messages from
> the volserver other than this - and basically no indication that
> anything is wrong.
>
> I get the error both with transarc 3.6-2.3 and openafs-cvs.
>
> Syslog looks like this:
> ----
> (lots and lots of stuff like the next few lines for the other volumes
> that moved ok.)
> May 30 10:30:18 afs4 fileserver[511]: fssync: volume 537013509 moved to 63019783; breaking all call backs
> May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537013509 deleted
> May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537013511 deleted
> May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537020173 deleted
> May 30 10:30:20 afs4 volserver[483]: 1 Volser: Clone: Cloning volume 536897629 to new volume 537020174
> May 30 10:30:20 afs4 fileserver[511]: fssync: volume 536897629 moved to 63019783; breaking all call backs
> May 30 10:30:20 afs4 volserver[483]: 1 Volser: Delete: volume 536897629 deleted
> May 30 10:30:20 afs4 volserver[483]: 1 Volser: Delete: volume 536897631 deleted
> May 30 10:30:22 afs4 volserver[483]: 1 Volser: Delete: volume 537020174 deleted
> May 30 10:30:23 afs4 volserver[483]: VAttachVolume: Error attaching volume /vicepd/V0536906941.vol; volume needs salvage
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906941
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536985904
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536889228
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536924071
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536896750
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536897341
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536983233
> May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906834
> (tons of that for every volume on the server, and happens again if you
> do a vos listvol against the server.)
> -----
>
> The other symptom - when clearing off a server, I happened to notice
> that the volserver seemed to hang (and not respond to any new client
> requests such as vos partinfo) if I started a vos release against it.
> Once the vos release (in particular the ForwardMulti) completed, the
> volserver responded again. I'm not talking about a huge volume - maybe
> 5-10 megs with a few thousand files in it.
>
> I'm running volserver with no options in both cases.
>
> -- Nathan
>
> ------------------------------------------------------------
> Nathan Neulinger EMail: nneul@umr.edu
> University of Missouri - Rolla Phone: (573) 341-4841
> Computing Services Fax: (573) 341-4216