[OpenAFS] Re: File contents change when copying into AFS with rsync

Dorian Taylor (Lists) dorian.taylor.lists@gmail.com
Wed, 26 Jan 2011 21:01:20 -0800


On 26-Jan-11, at 1:00 PM, Andrew Deason wrote:

> On Wed, 26 Jan 2011 12:45:13 -0800
> "Dorian Taylor (Lists)" <dorian.taylor.lists@gmail.com> wrote:
>
>> I've just begun teaching myself about OpenAFS, having installed it
>> (1.4.12.1+dfsg-2, Ubuntu Maverick i386)
>
> Does this describe both the client and fileserver?

In one case yes (USB drive to /afs on the same Ubuntu machine), in the
other no (OpenAFS 1.5.77 on OSX 10.5 to the Ubuntu machine).

>
>> The two times I've bulk-loaded data into my new AFS cell with rsync,
>> I've been paranoid enough to generate checksums of the files on both
>> sides. In both several-gigabyte batches there was exactly one file
>> with exactly one four-byte difference. What I'd like hopefully are
>> some ideas for narrowing down if this is an rsync thing, an OpenAFS
>> thing, or something else (cheap memory, gremlins, etc.).
>
> Was the difference in the same/similar spot both times? Could you share
> the actual/expected byte sequence?

By similar spot do you mean offset in the altered file, or offset in  
the batch of files?

Unfortunately I already blew away the evidence: the first time in the
interest of expedience, the second time before I realized I should
probably worry that it might be a trend.
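
If it happens again, a rough sketch along these lines (Python; the
function name diff_runs and the chunk size are just my own choices for
illustration, not anything from OpenAFS or rsync) should capture the
offset and the actual/expected bytes before the bad copy gets
overwritten. It only compares the part the two files have in common, so
a size difference would need to be checked separately.

```python
def diff_runs(path_a, path_b, chunk_size=65536):
    """Return [(offset, bytes_a, bytes_b), ...], one entry per contiguous
    run of differing bytes within the common length of the two files."""
    runs = []
    start = None          # offset where the current run of differences began
    run_a = bytearray()   # bytes from path_a in the current run
    run_b = bytearray()   # bytes from path_b in the current run
    offset = 0            # offset of the current chunk within the files
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break
            if a == b and start is None:
                # Fast path: identical chunk and no run in progress.
                offset += n
                continue
            for i in range(n):
                if a[i] != b[i]:
                    if start is None:
                        start = offset + i
                        run_a, run_b = bytearray(), bytearray()
                    run_a.append(a[i])
                    run_b.append(b[i])
                elif start is not None:
                    # First matching byte after a run closes that run.
                    runs.append((start, bytes(run_a), bytes(run_b)))
                    start = None
            offset += n
    if start is not None:
        runs.append((start, bytes(run_a), bytes(run_b)))
    return runs
```

Calling diff_runs(source_file, afs_copy) and printing each offset with
the two byte strings in hex would answer both the "where" and the "what"
question.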

> ...did you verify the checksum and/or contents on the source before
> _and_ after the transfer?

In the case of the USB->Ubuntu copy I had SHA-256 checksums already  
generated over the source for something else. Copying the source over  
the errant target yielded the expected checksum (the one I had already  
generated). In the case of the Mac->Ubuntu copy I generated MD5
checksums afterward (OSX 10.5 doesn't ship with a sha256sum). Again,
when I recopied the file from the source (using cp this time), the
target had the correct checksum.
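
For the record, the check amounts to something like the sketch below.
This is not the exact procedure I ran (I used the command-line checksum
tools); it assumes a Python interpreter with hashlib on whichever
machine can see both trees, and the function names are made up for
illustration.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash one file in chunks so large files never sit fully in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def tree_digests(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_digest(full, algorithm)
    return digests


def mismatches(src_root, dst_root, algorithm="sha256"):
    """Relative paths whose digest under dst_root is missing or differs."""
    src = tree_digests(src_root, algorithm)
    dst = tree_digests(dst_root, algorithm)
    return sorted(rel for rel in src if dst.get(rel) != src[rel])
```

Running mismatches(source_dir, afs_dir) right after the copy, and again
after recopying any file it lists, reproduces the before/after
comparison described above. Passing algorithm="md5" gives the MD5
variant.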

I'm assuming you're trying to infer if the source was somehow  
modified; in both cases I'm confident that the change showed up either  
in transit or while being saved (hopefully not the latter).

>> Currently my strategy is to try different methods of copying over the
>> next several days (as it will take days) until a clear culprit becomes
>> evident.
>
> Before doing that, you may want to see whether it is the data on the
> server that is 'bad', or if the clients are reporting different data to
> the application. You can do this by trying to read the file after
> flushing it from the cache ('fs flush'), reading from different clients,
> or reading it directly from the vicepX partition on the fileserver (ask
> if you want to know how to do this).


I just recopied the second set (from Mac to Ubuntu) using rsync (-av)  
exactly as I did before and this time there was no discrepancy. Must  
be a Heisenbug. In any case I will keep my eyes open.

Thanks for your help,

--
Dorian Taylor
Make things. Make sense.
http://doriantaylor.com