[OpenAFS] Re: File contents change when copying into AFS with rsync
Dorian Taylor (Lists)
Wed, 26 Jan 2011 21:01:20 -0800
On 26-Jan-11, at 1:00 PM, Andrew Deason wrote:
> On Wed, 26 Jan 2011 12:45:13 -0800
> "Dorian Taylor (Lists)" <email@example.com> wrote:
>> I've just begun teaching myself about OpenAFS, having installed it
>> (220.127.116.11+dfsg-2, Ubuntu Maverick i386)
> Does this describe both the client and fileserver?
In one case yes (USB drive to /afs on the same Ubuntu machine), in one
case no (OpenAFS 1.5.77 on OSX 10.5 to the Ubuntu machine).
>> The two times I've bulk-loaded data into my new AFS cell with rsync,
>> I've been paranoid enough to generate checksums of the files on both
>> sides. In both several-gigabyte batches there was exactly one file
>> with exactly one four-byte difference. What I'd like hopefully are
>> some ideas for narrowing down if this is an rsync thing, an OpenAFS
>> thing, or something else (cheap memory, gremlins, etc.).
> Was the difference in the same/similar spot both times? Could you
> provide the actual/expected byte sequence?
By similar spot do you mean offset in the altered file, or offset in
the batch of files?
Unfortunately I already blew away the evidence. The first time in the
interest of expedience, the second before realizing I should probably
worry that it might be a trend.
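
For next time, a minimal sketch of how the evidence could be captured before deleting anything: it lists every byte offset at which a source file and its copy differ, together with the expected and actual byte values. The function name `differing_offsets` and its parameters are hypothetical, made up for illustration, and not part of rsync or OpenAFS.

```python
def differing_offsets(path_a, path_b, chunk_size=1 << 20):
    """Return a list of (offset, byte_in_a, byte_in_b) for every position
    where the two files differ. A byte is None where one file has ended."""
    diffs = []
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            longest = max(len(chunk_a), len(chunk_b))
            if chunk_a != chunk_b:
                # Only walk byte by byte through chunks that actually differ.
                for i in range(longest):
                    x = chunk_a[i] if i < len(chunk_a) else None
                    y = chunk_b[i] if i < len(chunk_b) else None
                    if x != y:
                        diffs.append((offset + i, x, y))
            offset += longest
    return diffs
```

Called as `differing_offsets(source_file, afs_copy)`, this would answer both questions at once: where in the file the difference sits, and what the expected and actual bytes were.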
> ...did you verify the checksum and/or contents on the source before
> _and_ after the transfer?
In the case of the USB->Ubuntu copy I had SHA-256 checksums already
generated over the source for something else. Copying the source over
the errant target yielded the expected checksum (the one I had already
generated). In the case of the Mac->Ubuntu copy I generated MD5
checksums afterward (OSX 10.5 doesn't ship with a sha256sum). Again
when I copied the file from the source (using cp this time), the
target had the correct checksum.
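
The comparison I did by hand can be sketched roughly as follows: hash every file under the source and under the target, then report the relative paths whose digests disagree or which are missing on the target. This is only an illustration. The function names are hypothetical, and the `algorithm` parameter is there so that MD5 can be used where SHA-256 tooling isn't handy.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash one file in chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root, algorithm="sha256"):
    """Map each file's path (relative to root) to its hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_digest(full, algorithm)
    return digests


def compare_trees(source_root, target_root, algorithm="sha256"):
    """Return sorted relative paths that are missing from the target
    or whose digest differs from the source."""
    src = tree_digests(source_root, algorithm)
    dst = tree_digests(target_root, algorithm)
    return sorted(rel for rel in src if dst.get(rel) != src[rel])
```

An empty list from `compare_trees(source_dir, afs_dir)` would mean every source file has an identical copy on the target; it does not look for extra files that exist only on the target.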
I'm assuming you're trying to infer if the source was somehow
modified; in both cases I'm confident that the change showed up either
in transit or while being saved (hopefully not the latter).
>> Currently my strategy is to try different methods of copying over the
>> next several days (as it will take days) until a clear culprit
>> emerges.
> Before doing that, you may want to see whether it is the data on the
> server that is 'bad', or if the clients are reporting different data
> to the application. You can do this by trying to read the file after
> flushing it from the cache ('fs flush'), reading from different
> clients, or reading it directly from the vicepX partition on the fileserver
> (ask if you want to know how to do this).
I just recopied the second set (from Mac to Ubuntu) using rsync (-av)
exactly as I did before and this time there was no discrepancy. Must
be a Heisenbug. In any case I will keep my eyes open.
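
If it does show up again, here is a rough sketch of the cache check suggested above: hash the file, flush it from the client cache, and hash it again. It assumes the `fs` command is on the PATH of an AFS client and that `fs flush -path <file>` is the right invocation there. The helper names are hypothetical, and the flush step can be swapped for another callable.

```python
import hashlib
import subprocess


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def flush_from_cache(path):
    """Assumption: an AFS client with the 'fs' command available."""
    subprocess.run(["fs", "flush", "-path", path], check=True)


def reread_after_flush(path, expected_hex, flush=flush_from_cache):
    """Compare the file against the expected digest before and after
    flushing it from the client cache."""
    before = sha256_of(path) == expected_hex
    flush(path)
    after = sha256_of(path) == expected_hex
    return {"matches_before_flush": before, "matches_after_flush": after}
```

A mismatch before the flush that turns into a match afterward would point at the client cache; a mismatch both times would suggest the bad bytes are what the fileserver is handing out, though reading from a second client would still be worth doing before drawing conclusions.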
Thanks for your help,
Make things. Make sense.