[OpenAFS] Invalid cross-device link

Stefan Strandberg stefan@cae.wisc.edu
Thu, 20 Aug 2009 08:02:10 -0500


Hi,

We have been running OpenAFS 1.4.10 (specifically 1.4.10+dfsg1-1~bpo50+1
on Debian Lenny) on our fileservers for a month or so now without
incident.  Yesterday, we moved our VLDB servers from aging Solaris
machines running 1.4.2 to Debian Lenny machines running 1.4.10.  We
also upgraded from kaserver to krb5.

Everything worked fine for approximately an hour, and then things went
badly wrong.  After a few hours of unscheduled downtime, the only fix
seemed to be to roll back to 1.4.7 (specifically 1.4.7.dfsg1-6+lenny1)
on all vldb/fileserver machines.  The main symptom was that the
fileservers went into continuous salvage loops whenever a volume
operation was performed, along with some very strange errors in the
VolserLog.  Rolling everything back to 1.4.7 fixed the issue.  Sadly, I
forgot to save the logs before rolling back, and OpenAFS overwrites
them almost immediately, so I don't have the error messages from the
main fileservers for reference.

I tested this again this morning on a test fileserver and still get
an error.

Specifically, when releasing a volume, I get the following error:

This is a complete release of volume 536885604
Cloning RW volume 536885604 to temporary RO...Failed to clone the RW
volume 536885604
: Invalid cross-device link
Error in vos release command.
: Invalid cross-device link

The VolserLog on the fileserver contains:

Wed Aug 20 07:38:46 2009 [5] 1 Volser: ListVolumes: Volume 536885602
(V0536885602.vol) will be destroyed on next salvage
Wed Aug 20 07:38:46 2009 [7] 1 Volser: Delete: volume 536885602 deleted 
Wed Aug 20 07:38:46 2009 [10] 1 Volser: Clone: Cloning volume 536885601
to new volume 536885602
Wed Aug 20 07:38:46 2009 [3] VAttachVolume: Failed to open
/vicepc/V0536885602.vol (errno 2)
Wed Aug 20 07:38:46 2009 [4] 1 Volser: CreateVolume: Unable to create
the volume; aborted, error code 18
Wed Aug 20 07:38:46 2009 [4] : Invalid cross-device link
Wed Aug 20 07:42:06 2009 [7] 1 Volser: CreateVolume: volume 536885604
(bethbtest) created
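
For reference, the two numeric codes in the log above ("errno 2" and
"error code 18") can be decoded with a minimal Python sketch.  This
assumes both numbers are plain system errno values, which the
"Invalid cross-device link" text printed next to code 18 suggests but
does not prove.  The helper name describe_errno is only illustrative.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as 'ENOENT' or 'EXDEV'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's human-readable message.
    return name, os.strerror(code)


if __name__ == "__main__":
    # 2 and 18 are the codes that appear in the VolserLog excerpt.
    for code in (2, 18):
        name, message = describe_errno(code)
        print(f"errno {code}: {name} ({message})")
```

On Linux, 2 is ENOENT and 18 is EXDEV, the error that link(2) and
rename(2) return when the two paths are on different filesystems.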

So the volume release fails.  A Google search suggests that this is
only likely to happen when old volume parts are left lying around.
However, this is a brand new volume, and there are no traces of any
similar volumes on any partition.  It happens with the RW on any
partition of the fileserver, as long as the RO is on a different
partition.  (Yes, I know not to do that normally; this is just for
testing.)
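
As a generic sanity check for the RW/RO layout described above, the
following Python sketch reports whether two directories sit on the same
underlying device by comparing st_dev.  It is an illustration only: it
says nothing about what the volserver does internally, and apart from
/vicepc (which appears in the log) the partition name /vicepa is a
hypothetical example.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths are on the same device (st_dev matches)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical RW and RO partitions; adjust to the real layout.
    rw_part, ro_part = "/vicepa", "/vicepc"
    if os.path.isdir(rw_part) and os.path.isdir(ro_part):
        print(f"{rw_part} and {ro_part} on same device:",
              same_filesystem(rw_part, ro_part))
    else:
        print("partitions not present on this host")
```

Hard links and renames between two paths for which this returns False
fail with EXDEV.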

Anyone have any ideas?  I would really like to get everything on 1.4.10
for the performance increases.

Thanks,

-stefan

-- 
Stefan Strandberg
UNIX group
Computer Aided Engineering - UW Madison
stefan@cae.wisc.edu