[OpenAFS-devel] stability problems, and interesting symptoms...
Neulinger, Nathan
nneul@umr.edu
Wed, 30 May 2001 10:46:23 -0500
I've got two problems and one interesting symptom; the symptom is probably
unrelated to the first problem.
First, on a couple of my servers (this started about a month ago, with no
apparent changes to server hardware or software): if I start moving volumes
off the server en masse, one at a time, one after another, then at some
point in the process, after 50-100 volumes have been moved, I get a
volserver error complaining about being unable to attach a volume.
Once that happens, any listvol or other volserver activity against the
server fails from then on. Usually bos status indicates that vol exited with
signal 6, although not necessarily immediately (I haven't seen that with
openafs yet, but it was typically what I saw with 3.6-2.3). I have no
error messages from the volserver other than this, and basically no other
indication that anything is wrong.
I get the error with both Transarc 3.6-2.3 and openafs-cvs.
The syslog looks like this:
----
(lots and lots of stuff like the next few lines for the other volumes that
moved ok.)
May 30 10:30:18 afs4 fileserver[511]: fssync: volume 537013509 moved to 63019783; breaking all call backs
May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537013509 deleted
May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537013511 deleted
May 30 10:30:18 afs4 volserver[483]: 1 Volser: Delete: volume 537020173 deleted
May 30 10:30:20 afs4 volserver[483]: 1 Volser: Clone: Cloning volume 536897629 to new volume 537020174
May 30 10:30:20 afs4 fileserver[511]: fssync: volume 536897629 moved to 63019783; breaking all call backs
May 30 10:30:20 afs4 volserver[483]: 1 Volser: Delete: volume 536897629 deleted
May 30 10:30:20 afs4 volserver[483]: 1 Volser: Delete: volume 536897631 deleted
May 30 10:30:22 afs4 volserver[483]: 1 Volser: Delete: volume 537020174 deleted
May 30 10:30:23 afs4 volserver[483]: VAttachVolume: Error attaching volume /vicepd/V0536906941.vol; volume needs salvage
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906941
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536985904
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536889228
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536924071
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536896750
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536897341
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536983233
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906834
(Tons of that, one for every volume on the server, and it happens again if
you do a vos listvol against the server.)
-----
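
For anyone trying to find the same transition point in a long syslog, the
following is a rough sketch of a script that picks out the first
VAttachVolume error and collects the volume IDs that ListVolumes then fails
on. It is an illustration only: the regular expressions are guessed from the
lines quoted above, and the names SAMPLE and summarize are made up here, not
part of AFS or any AFS tool.

```python
import re

# A few lines copied from the excerpt above, one syslog record per line.
SAMPLE = """\
May 30 10:30:22 afs4 volserver[483]: 1 Volser: Delete: volume 537020174 deleted
May 30 10:30:23 afs4 volserver[483]: VAttachVolume: Error attaching volume /vicepd/V0536906941.vol; volume needs salvage
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536906941
May 30 10:30:23 afs4 volserver[483]: 1 Volser: ListVolumes: Could not attach volume 536985904
"""

# Patterns are assumptions based only on the message formats shown above.
ATTACH_ERR = re.compile(r"VAttachVolume: Error attaching volume (\S+?);")
LIST_ERR = re.compile(r"ListVolumes: Could not attach volume (\d+)")
TIMESTAMP = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)")


def summarize(text):
    """Return ((timestamp, volume_path) of the first attach error or None,
    list of volume IDs that ListVolumes could not attach)."""
    first_attach = None
    failed = []
    for line in text.splitlines():
        m = ATTACH_ERR.search(line)
        if m and first_attach is None:
            ts = TIMESTAMP.match(line)
            first_attach = (ts.group(1) if ts else None, m.group(1))
        m = LIST_ERR.search(line)
        if m:
            failed.append(int(m.group(1)))
    return first_attach, failed


if __name__ == "__main__":
    first, failed = summarize(SAMPLE)
    print("first attach error:", first)
    print("ListVolumes failures:", len(failed))
```

Run against the full log, the timestamp of the first attach error marks where
the volserver went bad, and the failure count shows whether every volume on
the server is affected.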
The other symptom: while clearing off a server, I happened to notice that
the volserver seemed to hang (not responding to any new client requests
such as vos partinfo) when I started a vos release against it. Once the vos
release (in particular the ForwardMulti) completed, the volserver responded
again. This isn't a huge volume, either: maybe 5-10 megs with a few
thousand files in it.
I'm running volserver with no options in both cases.
-- Nathan
------------------------------------------------------------
Nathan Neulinger EMail: nneul@umr.edu
University of Missouri - Rolla Phone: (573) 341-4841
Computing Services Fax: (573) 341-4216