[OpenAFS] backup - order of entries in volume set

Gunnar Krull gklists@cs.uni-goettingen.de
Fri, 28 Aug 2015 14:58:29 +0200

On 08/28/2015 10:32 AM, Gunnar Krull wrote:
> On 08/28/2015 03:02 AM, Benjamin Kaduk wrote:
>> On Thu, 27 Aug 2015, Gunnar Krull wrote:
>>> Hi,
>>> when I have a backup Volume Set defined in this order, the volume
>>> "user.backup" is not included into the backup:
>>> Volume set userbackup:
>>>      Entry 1: server .*, partition .*, volumes: user\..*\.backup
>>>      Entry 2: server .*, partition .*, volumes: user\.backup
>>> But when I change the order of the two Volume Set entries, the volume
>>> "user.backup" is included:
>>> Volume set userbackup:
>>>      Entry   1: server .*, partition .*, volumes: user\.backup
>>>      Entry   2: server .*, partition .*, volumes: user\..*\.backup
>>> It's strange, but the order is irrelevant for another Volume Set. The
>>> difference from the example above is that the resulting list of volumes
>>> to be backed up consists of only two volumes, namely "svn.backup" and
>>> "svn.test.backup".
>>> I couldn't find an explanation for this behavior.
>>> Is there something wrong in my understanding of the volume set
>>> definitions?
>> Most likely, there is a bug.
>> Unfortunately, the backup code is some of the least-maintained and worst
>> code in the tree that we still have some expectation of people actually
>> using (which excludes kauth, among other things), so the reason is
>> unlikely to be clear solely from code examination.  To make matters worse,
>> there are a few codepaths that could be taken; in what I think is the
>> common case, the regex is actually evaluated on the vlserver, not on the
>> machine running the backup utility.  (Note that this means that different
>> calls may get different results, if the vlservers are not homogeneous and
>> have different regex libraries on them!)
> All servers are running on Debian Wheezy. So it should be quite
> homogeneous.
> I observed this behavior with OpenAFS 1.6.9 and now also with 1.6.14.
>> As a first debugging step, I would suggest using wireshark or similar to
>> capture traffic between the backup utility and the vlserver(s) to confirm
>> whether the problem exists in the vlserver code or on the client side.
> The Wireshark records show that the backup client gets the complete
> volume list from the vlserver correctly, independent of the order in the
> volume set definition.
> I can see the two requests to the vlserver for both volume set entries
> and their corresponding responses including the volumes correctly.
> But the output and execution of the backup client depends on the order,
> like I described above.
> So, it seems to be the backup client that skips the volume and doesn't
> consider it for the actual backup.

I think the cause is the strncmp call in "src/bucoord/commands.c",
line 311:

for (tavols = ps->dupvdlist; add && tavols;
      tavols = tavols->next) {
     if (strncmp(tavols->name, entries[e].name, l) == 0) {
         if ((strcmp(&entries[e].name[l], ".backup") == 0)
             || (strcmp(&entries[e].name[l], ".readonly")
                 == 0)
             || (strcmp(&entries[e].name[l], "") == 0))
             add = 0;

To find and discard duplicate volumes, the volumes returned for the
current volume set entry are compared against the list of volumes
collected for the previous volume set entry. A volume that is found to
be a duplicate is not added to the list.

Unfortunately, the strncmp compares only the first l characters of the
two strings (l = length of the current volume name), so it is a prefix
match rather than a comparison of the full names.

Also, it compares volume names with and without extensions (e.g. .backup):
    tavols->name : with extension (e.g. user.test1.backup)
    entries[e].name : without extension (e.g. user.test1)
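To make the relation between the two name forms concrete, here is a minimal sketch of how a base name relates to a name carrying an extension. The helper base_name_len is hypothetical and is not taken from the OpenAFS tree; it only assumes that the extensions in question are ".backup" and ".readonly", as in the quoted code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the OpenAFS source): returns the length
 * of the base name, i.e. the volume name with one trailing ".backup" or
 * ".readonly" removed. A name without such an extension is returned at
 * its full length. */
static size_t base_name_len(const char *name)
{
    static const char *const exts[] = { ".backup", ".readonly" };
    size_t n = strlen(name);
    size_t i;

    for (i = 0; i < sizeof(exts) / sizeof(exts[0]); i++) {
        size_t e = strlen(exts[i]);
        /* Require n > e so that a bare ".backup" is not reduced to "". */
        if (n > e && strcmp(name + n - e, exts[i]) == 0)
            return n - e;
    }
    return n;
}
```

With this helper, "user.test1.backup" and "user.test1" share the same base name length, while "user.backup" reduces to the much shorter "user".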

This leads to the problem I described above, for example:

Names of volumes: user.backup, user.test1.backup, user.test2.backup, ...

Volume set userbackup:
       Entry 1: server .*, partition .*, volumes: user\..*\.backup
       Entry 2: server .*, partition .*, volumes: user\.backup

Resulting list for Entry 1: user.test1.backup, user.test2.backup, ...
Resulting list for Entry 2: user

The string compare for Entry 2 would be:
   strncmp("user.test1.backup", "user", 4)
This returns 0 (the first 4 characters match), so "user" is treated as a
duplicate and the volume "user.backup" is not added to the list.
(I don't understand the following if statement with three strcmp ...)
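
The comparison above can be reproduced outside of OpenAFS with a small sketch. The function names prefix_dup and strict_dup are hypothetical, and l is assumed to be the length of the name without extension, as described above. strict_dup is only one possible stricter check for illustration; I am not claiming it is how the code should be (or has been) fixed.

```c
#include <stddef.h>
#include <string.h>

/* Prefix-only check, modelled on the strncmp in the quoted loop:
 * nonzero when the first l characters of the earlier name equal the
 * current base name. */
static int prefix_dup(const char *earlier, const char *base)
{
    size_t l = strlen(base);    /* assumption: l = length of base name */
    return strncmp(earlier, base, l) == 0;
}

/* Hypothetical stricter check: in addition to the prefix match, whatever
 * follows the prefix in the earlier name must be "", ".backup" or
 * ".readonly". Reading earlier + l is safe here because the prefix match
 * guarantees that earlier holds at least l characters. */
static int strict_dup(const char *earlier, const char *base)
{
    size_t l = strlen(base);

    if (strncmp(earlier, base, l) != 0)
        return 0;
    return strcmp(earlier + l, "") == 0
        || strcmp(earlier + l, ".backup") == 0
        || strcmp(earlier + l, ".readonly") == 0;
}
```

For earlier = "user.test1.backup" and base = "user", prefix_dup reports a duplicate, while strict_dup does not, because the remainder ".test1.backup" is not one of the known extensions. For earlier = "user.backup" both report a duplicate.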

I hope this is correct and the explanation is understandable.