[OpenAFS-devel] Patches for Openafs compression support

Jeffrey Hutzelman jhutz@cmu.edu
Thu, 06 Jan 2005 18:12:18 -0500


On Thursday, January 06, 2005 21:06:50 +0100 Peter Somogyi 
<psomogyi@gamax.hu> wrote:

> I have the following problems/questions with this approach:
> - is it good to use the same compression level for storing and
> transmitting  data?
> - is it good to specify the same compression settings (whether to
> compress,  and level) for a whole volserver process? For example: when
> some of the  volumes on that server has RO sites with fast connection,
> and there are some  others with slow connection to RO sites? (Perhaps it
> would require a new  config file or database where a table describes for
> each volume whether to  use compression and level)

In general, I believe that being able to specify the level of compression 
used is usually more trouble than it is worth.  It is necessary to be able 
to specify the type of compression used (gzip vs bzip2 vs none), but I 
don't believe it is necessary to do so on a finer granularity than a whole 
volserver.

Note that the question of how finely-grained the configuration can be is 
largely an implementation issue, and is easy to change or extend without 
significantly affecting interoperability.  Therefore it's not critical that 
we get it right the first time...

There are a couple of potentially common cases where using compressed dumps 
may be harmful even though both ends support it, and I think these are 
worth considering.

The first is where your network is fast enough, or your CPUs slow or loaded 
enough, that using compressed dumps results in increased CPU load which is 
not outweighed by the reduction in network bandwidth used.  This is very 
likely to be a function of either the sending or receiving volserver and/or 
their network links.  I believe these cases can be addressed by 
volserver-level configuration; in particular (1) a volserver can be 
configured not to ever send compressed dumps, and (2) a volserver can be 
configured not to advertise support for compressed dumps in its 
capabilities, even though it is capable of decoding them.  In your example 
of a case where some RO servers are behind fast links, those servers could 
be configured not to advertise support for compressed dumps, and would 
therefore receive uncompressed dumps during volume moves or releases. 
However, they would still be able to decode a compressed dump if sent one 
by 'vos restore'.
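
To make the two knobs concrete, here is a rough sketch of the decision
in Python.  It is not volserver code; the field names and the
"compressed-dump" capability string are invented for illustration, and
the real configuration mechanism is still undecided.

```python
from dataclasses import dataclass

CAP_COMPRESSED = "compressed-dump"  # hypothetical capability name


@dataclass
class VolserverConfig:
    """Hypothetical per-volserver settings; names are illustrative only."""
    supports_compressed: bool = True   # software can encode/decode compressed dumps
    send_compressed: bool = True       # knob (1): allowed to send compressed dumps
    advertise_compressed: bool = True  # knob (2): list support in capabilities


def advertised_capabilities(cfg):
    """Capabilities this server reports to its peers."""
    caps = set()
    if cfg.supports_compressed and cfg.advertise_compressed:
        caps.add(CAP_COMPRESSED)
    return caps


def choose_dump_format(sender, receiver):
    """Format the sender generates when forwarding to the receiver."""
    if (sender.supports_compressed and sender.send_compressed
            and CAP_COMPRESSED in advertised_capabilities(receiver)):
        return "compressed"
    return "uncompressed"


def accepts_restore(cfg, dump_format):
    """Decoding depends on what the software supports, not on what it advertises."""
    return dump_format == "uncompressed" or cfg.supports_compressed
```

With this, an RO server behind a fast link would be set up as
VolserverConfig(advertise_compressed=False): it receives uncompressed
dumps on release, yet accepts_restore() is still true for a compressed
dump.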

The second potentially-harmful case is where the contents of a particular 
volume are mostly or entirely already-compressed data, in which case a 
compressed dump would likely be larger than an uncompressed dump, or would 
at least not be significantly smaller.  To deal with these cases, I would 
propose a flag which could be set in a volume header, indicating that 
dumps of this volume should not be compressed.  This flag would move with 
the volume (obviously, we'd have to invent a new dump tag for it), which 
IMHO gives you better behaviour than you'd get from a configuration file.
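
The size effect is easy to demonstrate.  The sketch below uses Python's
zlib as a stand-in for whatever compressor is chosen, with
pseudo-random bytes standing in for already-compressed volume contents.
The should_compress() helper and its no_compress_flag parameter are
hypothetical; they only illustrate how a per-volume flag might combine
with the receiver's capabilities.

```python
import random
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (above 1.0 means it grew)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive text compresses very well.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 2000

# Pseudo-random bytes stand in for already-compressed file contents.
rng = random.Random(0)
already_compressed = bytes(rng.getrandbits(8) for _ in range(64 * 1024))


def should_compress(no_compress_flag, receiver_supports_compressed):
    """Hypothetical policy: the per-volume flag overrides the negotiation."""
    return receiver_supports_compressed and not no_compress_flag


if __name__ == "__main__":
    print("text-like data:     %.3f" % compression_ratio(text_like))
    print("already compressed: %.3f" % compression_ratio(already_compressed))
```

For the random data the ratio comes out at roughly 1.0, since the
compressor can only add framing overhead, which is the case the
per-volume flag is meant to avoid.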

> - I don't understand how do you mean the auto-detection. Could you
> explain in  more details how/when/where to deploy the test server?

The idea is that when moving or releasing a volume, the administrative 
client simply calls AFSVolForward(), and it is the sending volserver's job 
to query the capabilities of the receiving server to determine whether it 
can handle compressed dumps.  So, suppose I add a new test server to my 
cell, running the new software with compressed-dump support.  I can move a 
volume onto this server; in this direction the dump will be uncompressed 
because the sending server (a production server in the same cell) does not 
know how to generate compressed dumps.  Now that my volume is on the test 
server, it is getting real load, the volume is probably being dumped every 
day by the backup system (good for making sure the backup system doesn't 
choke on compressed dumps), and so on.

When I'm done testing, I then move the volume back.  In this direction, the 
sending volserver is my test server.  It supports compressed dumps, so when 
I call AFSVolForward, it automatically queries the capabilities of the 
receiving server (again, one of my production servers).  Since the old 
server does not support compressed dumps, a normal uncompressed dump is 
generated.  If I were moving to a new server which did support compressed 
dumps, then a compressed dump would be generated and sent to the new server.
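
The round trip above can be sketched as follows.  This is a toy model
in Python, not the real RPC interface: the Volserver class, the
capability string, and the server and volume names are made up, and
forward() merely stands in for the sending side of AFSVolForward().

```python
COMPRESSED_DUMP = "compressed-dump"  # hypothetical capability name


class Volserver:
    """Toy model of a volserver; not the real RPC interface."""

    def __init__(self, name, supports_compressed):
        self.name = name
        self.supports_compressed = supports_compressed

    def capabilities(self):
        # Old software advertises nothing about compressed dumps.
        return {COMPRESSED_DUMP} if self.supports_compressed else set()

    def forward(self, volume, target):
        """Stand-in for the sending side of AFSVolForward()."""
        use_compression = (
            self.supports_compressed
            and COMPRESSED_DUMP in target.capabilities()
        )
        return "compressed" if use_compression else "uncompressed"


def move(volume, source, target):
    # The administrative client makes no decision about the dump format;
    # it only asks the source server to forward the volume.
    return source.forward(volume, target)


production = Volserver("prod1", supports_compressed=False)
test_server = Volserver("test1", supports_compressed=True)
new_server = Volserver("new1", supports_compressed=True)

if __name__ == "__main__":
    print(move("vol.example", production, test_server))
    print(move("vol.example", test_server, production))
    print(move("vol.example", test_server, new_server))
```

Only the last move uses a compressed dump, because it is the only one
where both the sender and the receiver run the new software.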

Obviously, this only helps with VolForward, where one volserver is talking 
directly to another.  If you used vos dump to dump a volume from a server 
supporting compressed dumps and then wanted to restore it to a server that 
did not, you'd have to somehow uncompress the dump first.  I don't see this 
as a major problem, particularly since I expect to provide a tool which can 
be used to convert between compressed and uncompressed dumps.
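
As a rough idea of what such a tool might do, here is a minimal sketch
in Python.  It assumes, purely for illustration, that a compressed dump
is the ordinary dump stream wrapped whole in gzip; the actual on-disk
format has not been settled, and the function names are invented.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def is_compressed(dump):
    """Assumption: a compressed dump is a plain gzip stream."""
    return dump[:2] == GZIP_MAGIC


def to_uncompressed(dump):
    """Return the uncompressed form; uncompressed input passes through."""
    return gzip.decompress(dump) if is_compressed(dump) else dump


def to_compressed(dump):
    """Return the compressed form; compressed input passes through."""
    return dump if is_compressed(dump) else gzip.compress(dump)


if __name__ == "__main__":
    # Placeholder bytes, not a real dump.
    raw = b"\x01" + b"placeholder volume data " * 200
    packed = to_compressed(raw)
    print(len(raw), len(packed), to_uncompressed(packed) == raw)
```

Sniffing the leading bytes is only safe if an uncompressed dump can
never begin with the gzip magic number; a real tool would more likely
rely on an explicit marker in the dump itself.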

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+@cmu.edu>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA