[OpenAFS] Invalid cross-device link

Stefan Strandberg stefan@cae.wisc.edu
Thu, 20 Aug 2009 09:19:01 -0500


Hi,

1.4.11 isn't really doable until it's at least in lenny-backports as we
don't want to roll our own versions of this.

As for a stale replica existing, I may be misunderstanding.  If you're
saying that it's a replica of that volume, I don't see how that's the
case.

Here's the creation and subsequent release attempt for a brand new
volume:

stefan@cog ~ $ vos create beth a foo.bar
Volume 536885610 created on partition /vicepa of beth
stefan@cog ~ $ vos addsite beth b foo.bar
Added replication site beth /vicepb for volume foo.bar
stefan@cog ~ $ vos rel -v foo.bar

foo.bar
    RWrite: 536885610
    number of sites -> 2
       server beth.cae.wisc.edu partition /vicepa RW Site
       server beth.cae.wisc.edu partition /vicepb RO Site  -- Not released
This is a complete release of volume 536885610
Cloning RW volume 536885610 to temporary RO... done
Getting status of RW volume 536885610... done
Ending cloning transaction on RW volume 536885610... done
Starting transaction on cloned volume 536885611... done
Creating new volume 536885611 on replication site beth.cae.wisc.edu: Failed to create the ro volume: : Input/output error
The volume 536885610 could not be released to the following 1 sites:
                          beth.cae.wisc.edu /vicepb
VOLSER: release could not be completed
Error in vos release command.
VOLSER: release could not be completed

And here's the VolserLog output:

Thu Aug 20 09:15:14 2009 1 Volser: CreateVolume: volume 536885610 (foo.bar) created
Thu Aug 20 09:15:24 2009 1 Volser: Clone: Cloning volume 536885610 to new volume 536885611
Thu Aug 20 09:15:24 2009 VAttachVolume: Failed to open /vicepb/V0536885611.vol (errno 2)
Thu Aug 20 09:15:24 2009 1 Volser: CreateVolume: Unable to create the volume; aborted, error code 18
Thu Aug 20 09:15:24 2009 : Invalid cross-device link
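
For reference, the two numeric codes in that log ("errno 2" and "error code 18") can be mapped to their symbolic names with a few lines of Python. This is only a sketch: it assumes the fileserver uses the usual Linux errno numbering, and the helper name decode() is made up for illustration, not part of any OpenAFS tool.

```python
import errno
import os


def decode(code):
    """Return (symbolic name, message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as 'ENOENT'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's message text for the code.
    return name, os.strerror(code)


# Codes taken from the VolserLog lines above.
for code in (2, 18):
    name, message = decode(code)
    print(f"errno {code}: {name} ({message})")
```

Code 18 is EXDEV, the error link(2) and rename(2) return when source and target are on different filesystems. That would be consistent with /vicepa and /vicepb being separate partitions, though the log alone doesn't say which call failed.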

Turning up debugging doesn't really show anything extra.

Thanks again,

-stefan

On Thu, Aug 20, 2009 at 09:42:46AM -0400, Derrick Brashear wrote:
> On Thu, Aug 20, 2009 at 9:39 AM, Jeffrey
> Altman<jaltman@secure-endpoints.com> wrote:
> > Stefan Strandberg wrote:
> >> Anyone have any ideas?  I would really like to get everything on 1.4.10
> >> for the performance increases.
> >
> > The current version of OpenAFS is 1.4.11 which addresses:
> >
> > - Fix race in background sync code which could cause volumes to go
> >   offline. (124359)
> >
> > This is not the issue you are describing.  However, please test with
> > 1.4.11 and see if the problem is still present.  If so, send logs and
> > report to openafs-bugs@openafs.org.
>
>
> it will still be present. the real problem is you have a stale copy of
> the replica elsewhere on the disk. there should be exactly one copy of
> 536885604, and it should be on the same partition as 536885602, both
> according to the vldb and in vos listvol output. arrange to make that
> true, and your issue will go away.
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info
>

--
Stefan Strandberg
UNIX group
Computer Aided Engineering - UW Madison
stefan@cae.wisc.edu