[OpenAFS-devel] File size limit exceeded

Jeffrey Hutzelman jhutz@cmu.edu
Tue, 10 Feb 2004 12:08:27 -0500


On Wednesday, February 11, 2004 00:12:48 +0800 Raymond Wong 
<raymond@lifewood.com> wrote:

>
> On Tue, 10 Feb 2004, Derek Atkins wrote:
>
>> I don't understand how one is related to another.  You can dump
> a >2GB AFS volume into a reiserfs file, but you cannot create
>> a file >2GB in AFS.
>
> The problem is that we can't create a >2GB dump file, not create a >2GB
> file in AFS.  We have a volume of 20GB and we can't dump it out.

Offhand, I'd guess you're using a command like

vos dump some.volume > dumpfile

This will never work for a large volume, because when you use I/O 
redirection the shell opens the output file, and it always opens it in 
32-bit mode because it doesn't know what the program that will be doing 
output is capable of.  Instead, you should use a command like

vos dump -id some.volume -file dumpfile

This will cause vos to open the file itself, in 64-bit mode.
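
To put numbers on the limit, here is a small Python sketch.  It assumes 
"32-bit mode" means a signed 32-bit file offset, and that the dump is about 
as large as the 20GB volume.  The function names are made up for 
illustration; only the vos arguments come from the command above.

```python
# Largest file offset a signed 32-bit off_t can represent (2 GiB - 1 byte).
MAX_32BIT_OFFSET = 2**31 - 1


def fits_in_32bit_offset(size_bytes):
    """True if a file of this size can be written through a 32-bit open."""
    return size_bytes <= MAX_32BIT_OFFSET


def vos_dump_argv(volume, dumpfile):
    """Build the argument list for the -file form of the command.

    Passing this list to subprocess.run() involves no shell and no
    redirection, so vos opens the dump file itself.
    """
    return ["vos", "dump", "-id", volume, "-file", dumpfile]


# The 20GB volume from the report, taken here as decimal gigabytes.
twenty_gb = 20 * 10**9

print(fits_in_32bit_offset(twenty_gb))
print(vos_dump_argv("some.volume", "dumpfile"))
```

The first line printed is False: a dump of that size is roughly ten times 
larger than a 32-bit offset can address.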


> We have two AFS machines.  If the first AFS in Linux is upgraded to 1.2.11
> and the second AFS server in XP is upgraded to 1.2.10.

You're not being clear here.  Are these the versions you're running now, or 
are you asking what will happen if you upgrade to those versions?  If the 
latter, you need to tell us what version you're upgrading _from_ in order 
to get a correct answer.

> Does the quorum
> bug still exist?

The quorum bug exists in all versions of OpenAFS prior to 1.2.11, in 
OpenAFS 1.3.x prior to 1.3.52, and in all versions of IBM AFS prior to 3.6 
patch 9.1, except that a binary patch is available from IBM for 3.6 patch 9.
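
For the OpenAFS side of that rule, a rough Python sketch of the version 
check follows.  It handles only plain dotted versions such as "1.2.10", 
does not cover IBM AFS patch levels, and assumes that releases later than 
the ones named above carry the fix.  The function names are hypothetical.

```python
def parse_version(text):
    """Turn a plain dotted version such as '1.2.10' into (1, 2, 10)."""
    return tuple(int(part) for part in text.split("."))


def openafs_has_quorum_bug(version):
    """Apply the OpenAFS ranges stated above to a dotted version string."""
    parsed = parse_version(version)
    if parsed[:2] == (1, 3):
        # 1.3.x series: affected prior to 1.3.52.
        return parsed < (1, 3, 52)
    # All other series: affected prior to 1.2.11.
    return parsed < (1, 2, 11)


for candidate in ("1.2.10", "1.2.11", "1.3.51", "1.3.52"):
    print(candidate, openafs_has_quorum_bug(candidate))
```

The 1.3.x series needs its own branch because a plain numeric comparison 
would treat every 1.3.x release as newer than 1.2.11 and so as fixed.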

It is caused by a bug in the algorithm used by the elected coordinator to 
determine how long it may remain sync site.  The bug affects only the 
ptserver, vlserver, kaserver, and buserver, and only in installations with 
more than one dbserver in which the elected coordinator is running a 
version of the software containing the bug.  The versions of other 
components and of servers other than the elected coordinator are not 
relevant.  If you have only two dbservers, the one you need to upgrade is 
the one with the numerically-lower IP address.
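
"Numerically-lower" means comparing the addresses as numbers, not as 
strings.  A minimal Python sketch of that comparison, assuming IPv4 
addresses; the function name and the sample addresses are made up.

```python
import ipaddress


def dbserver_to_upgrade(addresses):
    """Return the address that is lowest when compared as a number."""
    return min(addresses, key=lambda a: int(ipaddress.IPv4Address(a)))


servers = ["10.0.0.10", "10.0.0.9"]  # made-up example addresses

print(dbserver_to_upgrade(servers))  # 10.0.0.9
print(min(servers))                  # 10.0.0.10 (string order, misleading)
```

Sorting the addresses as text picks the wrong machine here, because "1" 
sorts before "9" even though 10 is greater than 9.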

> Do the AFS volume still accessible using newer version
> of software.

I think you're asking "if I upgrade, will I lose all my data?".  The answer 
to that depends on the platform, the version you're upgrading from, and the 
particular tool you're upgrading:

- There has not been a backward-incompatible change in the format of the 
ptserver and kaserver databases since at least before AFS 3.2; I suspect 
there has not been such a change in the lifetime of AFS as a commercial 
product.

- There has not been a backward-incompatible change in the format of the 
vlserver database since before AFS 3.4 (I think the last such change was in 
3.3a, but I'm not positive).

- I can't claim full knowledge on this, but I'm not aware of any 
backward-incompatible change in the buserver database in several IBM AFS 
versions.

- No version of OpenAFS has ever introduced a backward-incompatible change 
in the formats of any of the ubik-managed databases.  So, if you are 
running IBM AFS 3.4 or later, or any version of OpenAFS, you should be able 
to safely upgrade your ptserver, vlserver, kaserver, buserver at any time.

- I do not believe there has been a backward-incompatible change in the 
namei volume storage format in the lifetime of the Linux fileserver port. 
So, it should be safe to upgrade any Linux fileserver to a new version.

- I can't say how long the Windows server port has been stable, or what 
version it might be safe to upgrade from/to.  However, I can point out that 
at present OpenAFS does not support fileservers on Windows, and the server 
components are not included in the binary packages available on openafs.org.


> We are hesitated to upgrade to a higher version because we don't
> understand the quorum problem.  We have stored nearly 200GB
> enterprise files in the old version AFS and a proper migration process has
> to be carried out carefully.


OpenAFS 1.2.11 contains only the following updates:
- build support for MacOS 10.3
- a fix for a configure problem on sun4x_57
- the quorum fix
- a fix for a relatively uncommon data corruption problem on Linux 
fileservers.

The first two don't affect your platform, and the last two are things you 
probably want.  In any event, it is perfectly safe to upgrade from OpenAFS 
1.2.10 to 1.2.11, or to apply IBM's binary fixes to IBM AFS 3.6 patch 9.

The quorum problem affects only the database servers.  If you are concerned 
about the safety of your volume data, don't upgrade the fileserver or 
related tools -- upgrade only the ptserver, vlserver, kaserver, buserver.

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA