[OpenAFS] backup - order of entries in volume set
Gunnar Krull
gklists@cs.uni-goettingen.de
Fri, 28 Aug 2015 14:58:29 +0200
On 08/28/2015 10:32 AM, Gunnar Krull wrote:
> On 08/28/2015 03:02 AM, Benjamin Kaduk wrote:
>> On Thu, 27 Aug 2015, Gunnar Krull wrote:
>>
>>> Hi,
>>>
>>> when I have a backup Volume Set defined in this order, the volume
>>> "user.backup" is not included into the backup:
>>>
>>> Volume set userbackup:
>>> Entry 1: server .*, partition .*, volumes: user\..*\.backup
>>> Entry 2: server .*, partition .*, volumes: user\.backup
>>>
>>> But when I change the order of the two Volume Set entries, the volume
>>> "user.backup" is included:
>>>
>>> Volume set userbackup:
>>> Entry 1: server .*, partition .*, volumes: user\.backup
>>> Entry 2: server .*, partition .*, volumes: user\..*\.backup
>>>
>>>
>>> It's strange, but the order is irrelevant for another Volume Set. The
>>> difference from the example above is that the resulting list of volumes
>>> to be backed up consists of only two volumes, namely "svn.backup" and
>>> "svn.test.backup".
>>>
>>> I couldn't find an explanation for this behavior.
>>> Is there something wrong in my understanding of the volume set
>>> definitions?
>>
>> Most likely, there is a bug.
>>
>> Unfortunately, the backup code is some of the least-maintained and worst
>> code in the tree that we still have some expectation of people actually
>> using (which excludes kauth, among other things), so the reason is
>> unlikely to be clear solely from code examination. To make matters worse,
>> there are a few codepaths that could be taken; in what I think is the
>> common case, the regex is actually evaluated on the vlserver, not on the
>> machine running the backup utility. (Note that this means that different
>> calls may get different results, if the vlservers are not homogeneous and
>> have different regex libraries on them!)
>
> All servers are running on Debian Wheezy. So it should be quite
> homogeneous.
> I observed this behavior with OpenAFS 1.6.9 and now also with 1.6.14.
>
>>
>> As a first debugging step, I would suggest using wireshark or similar to
>> capture traffic between the backup utility and the vlserver(s) to confirm
>> whether the problem exists in the vlserver code or on the client side.
>
> The Wireshark records show that the backup client gets the complete
> volume list from the vlserver correctly, independent of the order in the
> volume set definition.
> I can see the two requests to the vlserver for both volume set entries
> and their corresponding responses including the volumes correctly.
>
> But the output and execution of the backup client depends on the order,
> like I described above.
>
> So, it seems to be the backup client that skips the volume and doesn't
> consider it for the actual backup.
I think the reason for this is a strncmp in "src/bucoord/commands.c"
line 311:

    for (tavols = ps->dupvdlist; add && tavols;
         tavols = tavols->next) {
        if (strncmp(tavols->name, entries[e].name, l) == 0) {
            if ((strcmp(&entries[e].name[l], ".backup") == 0)
                || (strcmp(&entries[e].name[l], ".readonly") == 0)
                || (strcmp(&entries[e].name[l], "") == 0))
                add = 0;
        }
    }

To find and discard duplicate volumes, the list of volumes resulting from
the current volume set entry is compared to the list of volumes from the
previous volume set entry. A duplicate volume is not added to the list.

Unfortunately, the strncmp only compares the first l characters of the two
strings (l = length of the current volume name).
Also, it compares volume names with and without extensions (e.g. .backup):

    tavols->name    : with extension (e.g. user.test1.backup)
    entries[e].name : without extension (e.g. user.test1)

This leads to the problem I described above, for example:

Names of volumes: user.backup, user.test1.backup, user.test2.backup, ...

Volume set userbackup:
Entry 1: server .*, partition .*, volumes: user\..*\.backup
Entry 2: server .*, partition .*, volumes: user\.backup

Resulting list for Entry 1: user.test1.backup, user.test2.backup, ...
Resulting list for Entry 2: user

The string compare for Entry 2 would be:

    strncmp("user.test1.backup", "user", 4)

This returns 0, so the names are treated as matching and the volume
"user.backup" is not added to the list.
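
The false match can be reproduced outside of OpenAFS with a few lines of C.
This is only a sketch of the comparison as I understand it: prefix_matches
is a made-up helper name, not a function from the OpenAFS tree, and it
assumes that l is the strlen() of the name without extension.

```c
#include <string.h>

/*
 * Hypothetical helper, not OpenAFS code: mimics the prefix test used in
 * the duplicate check.
 *   listed: a name already in the list, with extension
 *   base:   the current volume name, without extension
 * Returns 1 if the first strlen(base) characters of both names are equal.
 */
static int prefix_matches(const char *listed, const char *base)
{
    size_t l = strlen(base);    /* assumed to correspond to "l" */

    return strncmp(listed, base, l) == 0;
}
```

Calling prefix_matches("user.test1.backup", "user") returns 1, although the
two names belong to different volumes.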
(I don't understand the if statement with the three strcmp calls that
follows the strncmp ...)
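
One observation on that if statement, assuming l really is
strlen(entries[e].name): &entries[e].name[l] then always points at the
terminating NUL, so the comparison with "" always succeeds and every prefix
match counts as a duplicate. The sketch below shows this and, as an untested
idea and not the OpenAFS implementation, a stricter check that looks at the
remainder of the name already in the list instead. Both rest_of_current and
is_duplicate are hypothetical helper names.

```c
#include <string.h>

/*
 * Remainder of the current name after l characters. With
 * l == strlen(base) this is always the empty string.
 */
static const char *rest_of_current(const char *base)
{
    return &base[strlen(base)];
}

/*
 * Hypothetical stricter duplicate test (an assumption, not OpenAFS code):
 * the listed name must be exactly base, base ".backup" or base ".readonly".
 */
static int is_duplicate(const char *listed, const char *base)
{
    size_t l = strlen(base);

    /* After a prefix match, listed has at least l characters. */
    if (strncmp(listed, base, l) != 0)
        return 0;

    return strcmp(&listed[l], "") == 0
        || strcmp(&listed[l], ".backup") == 0
        || strcmp(&listed[l], ".readonly") == 0;
}
```

With this variant, is_duplicate("user.test1.backup", "user") returns 0,
while is_duplicate("user.test1.backup", "user.test1") still returns 1.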
I hope this is correct and the explanation is understandable.
Regards,
Gunnar