[OpenAFS-devel] Patches for Openafs compression support

Jeffrey Hutzelman jhutz@cmu.edu
Thu, 06 Jan 2005 13:40:09 -0500


On Thursday, January 06, 2005 18:55:19 +0100 Peter Somogyi 
<psomogyi@gamax.hu> wrote:

> 1. Dump file is decompressed right after receiving. Only the network
> traffic is compressed at "vos dump" and "vos restore". The dump file you
> get is not compressed.
> So that I think compressing files one-by-one is unnecessary.

I said nothing about compressing files one-by-one.  I said that in a 
compressed dump format (which I prefer to compression-on-the-wire), the 
compressed portion of the dump should start with the first vnode, after the 
dump and volume headers.  This would allow the majority of existing tools 
that peek inside dump headers to continue to work with compressed dumps 
without modification.
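
As a rough illustration of that layout (a sketch only, under an assumed and
heavily simplified framing, not the actual OpenAFS dump format): the header
bytes are copied through untouched and only the vnode stream behind them is
transformed, so a tool that reads just the header never sees compressed
data.  The fixed header length, every toy_* name, and the trivial run-length
coder standing in for a real compressor are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified framing: NOT the real OpenAFS dump layout. */
#define TOY_HEADER_LEN 8   /* assumed fixed-size dump + volume header */

/* Stand-in compressor: byte-wise run-length encoding as (count, value)
 * pairs.  Returns bytes written, or 0 if the output buffer is too small. */
static size_t toy_rle_compress(const uint8_t *in, size_t in_len,
                               uint8_t *out, size_t out_cap)
{
    size_t i = 0, o = 0;
    while (i < in_len) {
        uint8_t value = in[i];
        size_t run = 1;
        while (i + run < in_len && in[i + run] == value && run < 255)
            run++;
        if (o + 2 > out_cap)
            return 0;
        out[o++] = (uint8_t)run;
        out[o++] = value;
        i += run;
    }
    return o;
}

/* Build a "compressed dump": copy the header verbatim and compress only
 * the vnode stream that follows it.  Returns total length, 0 on error. */
static size_t toy_make_compressed_dump(const uint8_t *dump, size_t dump_len,
                                       uint8_t *out, size_t out_cap)
{
    size_t body_len;
    if (dump_len < TOY_HEADER_LEN || out_cap < TOY_HEADER_LEN)
        return 0;
    memcpy(out, dump, TOY_HEADER_LEN);      /* header stays readable */
    if (dump_len == TOY_HEADER_LEN)
        return TOY_HEADER_LEN;              /* nothing after the header */
    body_len = toy_rle_compress(dump + TOY_HEADER_LEN,
                                dump_len - TOY_HEADER_LEN,
                                out + TOY_HEADER_LEN,
                                out_cap - TOY_HEADER_LEN);
    if (body_len == 0)
        return 0;
    return TOY_HEADER_LEN + body_len;
}

/* What an existing header-peeking tool might do: read a 32-bit big-endian
 * field from the start of the dump, knowing nothing about compression. */
static uint32_t toy_peek_header_word(const uint8_t *dump)
{
    return ((uint32_t)dump[0] << 24) | ((uint32_t)dump[1] << 16) |
           ((uint32_t)dump[2] << 8) | (uint32_t)dump[3];
}
```

The point of the split is that toy_peek_header_word() gives the same answer
on the original and on the compressed output, which is the property that
compress-on-the-wire does not need and a compressed dump format does.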

> 2. Now I've examined exactly in the afs source how the encryption flag
> works
>
> From the user's aspect: there's a flag for some vos commands "-encrypt".
> For compression, now there's a flag "-z" (and -bz2). So that the usage
> is the same.
> I think this design is almost the best _for the users_. (Exception: I
> still assume in the patch that all servers of the given volume's sites
> support compression at "vos release" if compression is given. Perhaps I
> could refine it later.)

I did not suggest behaviour similar to the -encrypt flag.  I suggested that 
rather than compress-on-the-wire, the dump format be extended, and that the 
volserver automatically generate compressed-format dumps when configured to 
do so.  This configuration would be controlled by a command-line switch to 
the volserver, not to vos.  Such an approach would not require changes to 
vos or any other administrative client.  In particular, it would permit 
administrators to continue dumping, restoring, and moving volumes using 
whatever tools they are accustomed to, and without having to know whether 
the servers involved support compression in order to take advantage of the 
new functionality.

I do not believe there needs to be _any_ exposed user interface for this 
feature, and I believe you are going through a lot of effort making 
pervasive API changes to provide a UI that is completely unnecessary.


> 3. It's a good idea to "autodetect" compression from the header.  Then I
> would need at least one new function "GetCapabilities" in the interface.
> One problem: it is good only for restore/dump.
> But it doesn't help for "vos release", since it calls "VolForward" to
> multiple servers (for multiple targets), and the compression flag MUST
> be given to the other servers if we want to make them compress...
> So that at least "VolForward[Multiple]Z" is necessary.
> But this modification wouldn't help the user, just harder to implement:)

I'm aware of how volume moves work.  It is not the case that a special 
VolForwardZ would be necessary to control whether moves were done using 
compression.  Instead, I would expect the source volserver (the one to 
which you make the VolForward call) to determine the capabilities of each 
target server and behave appropriately, without user intervention.  Again, 
the client should not need to know or care whether compressed dumps are 
being used.

Note that this mechanism is not strictly required -- correct behaviour can 
be obtained simply by not turning on the generation of compressed dumps 
until all servers in the cell are new enough to support them.  However, I 
expect that the transition to the new functionality would be eased by 
allowing AFSVolForward to do automatic capability detection.  This would, 
for example, allow a site to deploy a test server using the new compression 
code, and easily move volumes onto and off of the test server.
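
A minimal sketch of that decision, assuming the source server has already
learned each target's capabilities by some probe that is not shown here.
All names are hypothetical; nothing below is an existing OpenAFS interface.

```c
#include <stddef.h>

/* Hypothetical names throughout; none of these exist in OpenAFS. */
enum toy_dump_format { TOY_FORMAT_PLAIN, TOY_FORMAT_COMPRESSED };

struct toy_target {
    const char *name;
    int supports_compressed;   /* result of an assumed capability probe */
};

/* Choose the format for one target: compressed only when the local
 * server is configured for it AND the target can accept it. */
static enum toy_dump_format
toy_choose_format(int local_compression_enabled, const struct toy_target *t)
{
    if (local_compression_enabled && t->supports_compressed)
        return TOY_FORMAT_COMPRESSED;
    return TOY_FORMAT_PLAIN;
}

/* Decide per target, as a forwarding server might when sending to
 * several sites.  Returns how many targets get the compressed format. */
static size_t
toy_plan_forward(int local_compression_enabled,
                 const struct toy_target *targets, size_t n,
                 enum toy_dump_format *formats_out)
{
    size_t i, compressed = 0;
    for (i = 0; i < n; i++) {
        formats_out[i] = toy_choose_format(local_compression_enabled,
                                           &targets[i]);
        if (formats_out[i] == TOY_FORMAT_COMPRESSED)
            compressed++;
    }
    return compressed;
}
```

Because the choice is made per target on the server side, a mixed cell
works without the caller passing any compression flag: old targets get the
plain format and new ones get the compressed format.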


> Thanks for the info. I think it will be actual only after the interface
> (with the patch) gets accepted.

The standard approach is to request and receive new procedure numbers 
before submitting a patch, rather than submitting a patch which uses 
made-up or private-use numbers.  This prevents the unfortunate situation in 
which there is widespread use of private-use numbers for non-private uses, 
which has a number of bad side-effects:

- Interoperability problems when two different escaped experiments
  assign different meanings to the same numbers.
- Inability to actually use the private-use numbers for private uses,
  due to interop problems with some escaped experiment.
- The need for production code to support the "wrong" numbers in order
  to ensure interoperability with escaped experiments, thereby making
  the first two problems worse.

Rx RPC numbers are 32 bits, so the available namespace is quite large.
Request numbers, and they will be assigned.
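
To make the first problem concrete, here is a small sketch that compares
the procedure-number tables of two builds and counts numbers that both use
with different meanings.  The struct, the function, and any numbers or
procedure names you feed it are hypothetical illustrations, not real
assignments.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of what one build believes a number means. */
struct toy_rpc_assignment {
    uint32_t number;       /* procedure number as sent on the wire */
    const char *meaning;   /* what this build thinks the number means */
};

/* Count numbers that both builds use, but with different meanings.
 * Each such number is an interoperability failure waiting to happen. */
static size_t
toy_count_conflicts(const struct toy_rpc_assignment *a, size_t na,
                    const struct toy_rpc_assignment *b, size_t nb)
{
    size_t i, j, conflicts = 0;
    for (i = 0; i < na; i++)
        for (j = 0; j < nb; j++)
            if (a[i].number == b[j].number &&
                strcmp(a[i].meaning, b[j].meaning) != 0)
                conflicts++;
    return conflicts;
}
```

With centrally assigned numbers the count is zero by construction; with
self-chosen numbers, nothing prevents two independent patches from landing
on the same value.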

> NOTE: I can see some new functions already existing in the interface, too:
> #define     VOLCONVERTRO        65536
> #define     VOLGETSIZE          65537

Yup, I know about those -- I assigned them.

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA