[OpenAFS] vos release
Russ Allbery
rra@stanford.edu
Thu, 08 Aug 2002 14:22:15 -0700
Derrick J Brashear <shadow@dementia.org> writes:
> On Thu, 8 Aug 2002, Russ Allbery wrote:
>> Just as a data point, it's not clear to me that this always has
>> something to do with network problems. We've seen exactly the same
>> behavior on the campus network with no noticeable network difficulties
>> between the servers. Every so often the volume release would just not
>> work; usually it would involve "possible communication failure" errors
>> and usually errors about being unable to start a transaction. It
>> seemed to be strongly correlated
> I think a fix in OpenAFS 1.2.6 will help this. Particularly, Brent
> Johnson mentioned something to me at Usenix and based on that we made a
> change in the fssync interface. I'm told IBM made an analogous change
> sometime recently also.
So far I seem to be having fewer problems, but they're not gone. The
first volume release I did today failed the same way it had before the
upgrade attempt, and an immediate retry of the same command succeeded:
(root) windlord:~> alias rfv
vos release -f -v
(root) windlord:~> rfv pubsw.siteemacs
pubsw.siteemacs
RWrite: 2003896810 ROnly: 2003896811 Backup: 2003896812
number of sites -> 4
server afssvr22.Stanford.EDU partition /vicepj RW Site
server afssvr22.Stanford.EDU partition /vicepj RO Site
server afssvr23.Stanford.EDU partition /vicepm RO Site
server afssvr11.Stanford.EDU partition /vicepd RO Site
This is a complete release of the volume 2003896810
Recloning RW volume ...
Failed to end cloning transaction on RW 2003896811
Possible communication failure
Error in vos release command.
Possible communication failure
(root) windlord:~> rfv pubsw.siteemacs
pubsw.siteemacs
RWrite: 2003896810 ROnly: 2003896811 Backup: 2003896812
number of sites -> 4
server afssvr22.Stanford.EDU partition /vicepj RW Site
server afssvr22.Stanford.EDU partition /vicepj RO Site
server afssvr23.Stanford.EDU partition /vicepm RO Site
server afssvr11.Stanford.EDU partition /vicepd RO Site
This is a complete release of the volume 2003896810
Recloning RW volume ...
Updating existing ro volume 2003896811 on afssvr23.Stanford.EDU ...
Starting ForwardMulti from 2003896811 to 2003896811 on afssvr23.Stanford.EDU (full release).
Updating existing ro volume 2003896811 on afssvr11.Stanford.EDU ...
Starting ForwardMulti from 2003896811 to 2003896811 on afssvr11.Stanford.EDU (full release).
updating VLDB ... done
Released volume pubsw.siteemacs successfully
Perhaps the -f flag has something to do with it at this point? We
standardized on always using it some time ago, when Transarc AFS would
corrupt the volume unless -f was given, and never changed back.
In a bunch of subsequent volume releases, I haven't had any trouble except
for an occasional:
Could not end transaction on a ro volume: Possible communication failure
right before the volume release finishes that doesn't seem to have
interfered with the success of the release.
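Since the identical command went through on the second attempt above, one
possible stopgap is a small retry wrapper. This is only a sketch:
release_with_retry, VOS_CMD, MAX_TRIES, and RETRY_DELAY are names made up
for illustration, the only vos invocation it uses is the "vos release -f -v"
from the rfv alias, and it assumes vos exits non-zero when the release
fails. It hides the symptom and says nothing about the cause.

```shell
#!/bin/sh
# Hypothetical helper: retry "vos release -f -v" a few times before giving up.
# VOS_CMD, MAX_TRIES, and RETRY_DELAY are illustrative names, not OpenAFS settings.
VOS_CMD=${VOS_CMD:-vos}
MAX_TRIES=${MAX_TRIES:-3}
RETRY_DELAY=${RETRY_DELAY:-30}   # seconds to wait between attempts

release_with_retry() {
    volume=$1
    try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        # Assumption: a failed release makes vos exit with a non-zero status.
        if "$VOS_CMD" release -f -v "$volume"; then
            return 0
        fi
        echo "release of $volume failed (attempt $try of $MAX_TRIES)" >&2
        try=$((try + 1))
        # Only pause if another attempt is coming.
        if [ "$try" -le "$MAX_TRIES" ]; then
            sleep "$RETRY_DELAY"
        fi
    done
    return 1
}
```

Usage would be something like "release_with_retry pubsw.siteemacs" after
sourcing the file. The function returns 0 as soon as one attempt succeeds
and 1 if every attempt fails, so a calling script can still notice a
release that never went through.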
--
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>