[OpenAFS-devel] X86_64 / rc6 problem in volume move / addsite (rx call failed: 1492325122)

Marco Hoehle MHO@zurich.ibm.com
Thu, 6 Oct 2005 12:10:55 +0200


Hi,

Over the last few days we have encountered a problem with RC6 on x86_64
fileserver hosts.
Situation: two fileservers with largefile-fileserver enabled, one on i586
and the other on x86_64.
I have 70 volumes; 3 of them could not be moved from the i586 to the
x86_64 host. Today I tried adding a replication site for my home volume
and got the same behaviour, which makes the x86_64 fileserver more or less
unusable.
Below you can see the error messages. They say that the volume is a
badly formatted dump. I checked it with volinfo, but that reports nothing
wrong.
This problem occurs only when we move data TO the x86_64 host. I tried it
with another i586 host: no problem. So my decision is to reinstall the
x86_64 machine with a 32-bit OS and not to use the 64-bit code :) But I
thought it might be important for you to know that there may be a problem
in the rc6 x86_64 code.
I'll keep one dump as a file-based dump in case you're interested in
debugging this one.

The history of the volume: it was created, I believe, on Transarc (?nik?)
and later moved to openafs-1.2.13 (AIX). We have now moved it to
openafs-1.4.0-rc6 (SLES9/i586) and want to move it to SLES9/x86_64.


-------------------- vos move ------------------------
brandis:~ # vos move tmp.wbk brandis /vicepa thierstein /vicepa -v
Starting transaction on source volume 536872371 ... done
Allocating new volume id for clone of volume 536872371 ... done
Cloning source volume 536872371 ... done
Ending the transaction on the source volume 536872371 ... done
Starting transaction on the cloned volume 536898359 ... done
Setting flags on cloned volume 536898359 ... done
Getting status of cloned volume 536898359 ... done
Creating the destination volume 536872371 ... done
Setting volume flags on destination volume 536872371 ... done
Dumping from clone 536898359 on source to volume 536872371 on destination ...
Failed to move data for the volume 536872371
   VOLSER: Problems encountered in doing the dump !
vos move: operation interrupted, cleanup in progress...
clear transaction contexts
Recovery: Releasing VLDB lock on volume 536872371 ... done
Recovery: Ending transaction on clone volume ... done
Recovery: Ending transaction on destination volume ... done
Recovery: Accessing VLDB.
move incomplete - attempt cleanup of target partition - no guarantee
Recovery: Creating transaction for destination volume 536872371 ... done
Recovery: Setting flags on destination volume 536872371 ... done
Recovery: Deleting destination volume 536872371 ... done
Recovery: Ending transaction on destination volume 536872371 ... done
Recovery: Creating transaction on source volume 536872371 ... done
Recovery: Setting flags on source volume 536872371 ... done
Recovery: Ending transaction on source volume 536872371 ... done
Recovery: Creating transaction on clone volume 536898359 ... done
Recovery: Deleting clone volume 536898359 ... done
Recovery: Ending transaction on clone volume 536898359 ... done
Recovery: Releasing lock on VLDB entry for volume 536872371 ... done
cleanup complete - user verify desired result
---------------------------------------------------------------

------------------ VolServ.log on i586 (brandis, source server) ---------------------
Thu Oct  6 11:46:21 2005 1 Volser: Clone: Cloning volume 536872371 to new volume 536898359
Thu Oct  6 11:47:37 2005 1 Volser: DumpVolume: Rx call failed during dump, error 1492325122
Thu Oct  6 11:48:42 2005 1 Volser: Delete: volume 536898359 deleted
---------------------------------------------------------------

The VolServ.log on the target fileserver shows nothing beyond the volume being created and removed.
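
For reference, the Rx error number in the source log can be decoded, assuming it is a standard com_err-style code (an error-table name packed into the high bits, a message index in the low 8 bits), which is how AFS error codes are normally built. The sketch below is only an illustration; `decode_com_err` is a hypothetical helper name, not part of OpenAFS.

```python
# Minimal sketch: split a com_err-style code into table name and offset.
# Assumption: the code follows the usual com_err packing (6 bits per
# character of the table name, 8 bits for the per-table message index).
CHARSET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           "abcdefghijklmnopqrstuvwxyz"
           "0123456789_")


def decode_com_err(code):
    """Return (table_name, offset) for a com_err-style error code."""
    offset = code & 0xFF                 # message index within the table
    packed = (code >> 8) & 0o77777777    # up to four 6-bit characters
    name = ""
    for shift in (18, 12, 6, 0):
        idx = (packed >> shift) & 0x3F
        if idx:                          # 0 means "no character here"
            name += CHARSET[idx - 1]
    return name, offset


if __name__ == "__main__":
    # The code reported by DumpVolume on the source server.
    print(decode_com_err(1492325122))    # prints ('VOLS', 2)
```

So the code belongs to the VOLS (volserver) error table at offset 2. Mapping that offset to a symbolic name and message requires the volser error table from the OpenAFS source tree, which is not reproduced here.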

---------------------------------------------------------------------------------------
 volinfo -volumeid tmp.wbk -header > volinfo.txt
(btw: volinfo -volumeid does NOT list only the given volume, but the whole partition)

Volume header for volume 536872371 (tmp.wbk)
stamp.magic = 78a1b2c5, stamp.version = 1
inUse = 1, inService = 1, blessed = 1, needsSalvaged = 0, dontSalvage = 229
type = 0 (read/write), uniquifier = 739423, needsCallback = 0, destroyMe = 0
id = 536872371, parentId = 536872371, cloneId = 536898359, backupId = 0, restoredFromId = 0
maxquota = 3000000, minquota = 0, maxfiles = 0, filecount = 77401, diskused = 2450215
creationDate = 709804318 (1992/06/29.09:51:58), copyDate = 1128521771 (2005/10/05.16:16:11)
backupDate = 1032577714 (2002/09/21.05:08:34), expirationDate = 0 (1970/01/01.01:00:00)
accessDate = 0 (1970/01/01.01:00:00), updateDate = 1110919076 (2005/03/15.21:37:56)
owner = 20484, accountNumber = 0
dayUse = 110; week = (0, 0, 0, 0, 0, 0, 0), dayUseDate = 1128463200 (2005/10/05.00:00:00)
Volume header (size = 76):
      stamp = 0x1
      VolId = 536898350
      parent      = 536898324
      Info inode  = 2305960854660579327 (size = 552)
      Small inode = 2305960854727688191 (size = 16448)
      Large inode = 2305960854794797055 (size = 512)
Total aux volume size = 17588

      Link inode  = 2305960743326973951 (size = 14)
Total aux volume size = 17602

Inode 2305960854660579327: Good magic 78a1b2c5 and version 1
Inode 2305960854727688191: Good magic 99776655 and version 1
Inode 2305960854794797055: Good magic 88664433 and version 1
Inode 2305960743326973951: Good magic 99877712 and version 1
Volume header for volume 536898350 (move-clone-temp)
stamp.magic = 78a1b2c5, stamp.version = 1
inUse = 0, inService = 0, blessed = 1, needsSalvaged = 0, dontSalvage = 0
type = 1 (readonly), uniquifier = 6, needsCallback = 0, destroyMe = d3
id = 536898350, parentId = 536898324, cloneId = 536898350, backupId = 0, restoredFromId = 0
maxquota = 4000000, minquota = 0, maxfiles = 0, filecount = 1, diskused = 3000002
creationDate = 1128524802 (2005/10/05.17:06:42), copyDate = 1128524802 (2005/10/05.17:06:42)
backupDate = 0 (1970/01/01.01:00:00), expirationDate = 0 (1970/01/01.01:00:00)
accessDate = 0 (1970/01/01.01:00:00), updateDate = 1128524733 (2005/10/05.17:05:33)
owner = 0, accountNumber = 0
dayUse = 0; week = (0, 0, 0, 0, 0, 0, 0), dayUseDate = 0 (1970/01/01.01:00:00)
-------------------------------------------------------------------------

Thanks for your interest.

Regards marco

P.S.: I already wrote a mail to Derrick but got no response. If you're
interested in the SLES9 binary rpms of the latest openafs code for
i586/x86_64/ppc64 (power5), please let me know ... but anyway, I've seen
that the libafs for 1.3.85 is integrated in the opensuse 10.x kernel rpm.