[OpenAFS] possible microsoft excel 2010 corruption issue on openafs volumes

Arthur Prokosch arthurp@csail.mit.edu
Wed, 21 Dec 2011 09:15:24 -0500


I've been following the conversation that started with Jonathan
Nilsson's report:
> We've had some strange file corruption and missing data issues with
> Microsoft Excel 2010 on a Windows 7 OpenAFS 1.5.78 client.
>
> 1) A user creates an Excel file on their local drive, then copies it
> to an AFS RW mount point (via mapped network drive).
> 2) several clients read the file (no changes made)
> 3) at some point the file is no longer able to be opened, or it opens
> but warns of corruption and missing data
>
> When I recover the file from the backup on the day of the file's
> modification timestamp, the restored file is fine. Curiously, the
> restored file appears identical to the corrupted file: the file size (bytes)
> and timestamp are identical.

We've experienced corruption that sounds very similar at our site.
The volume with collaborative Excel usage was migrated to a 1.6.0
server sometime between 11/10 and 11/23, then back to 1.4.12.1 on
12/9. Corrupt Excel files were discovered starting 11/30, most
recently on 12/19.

I was at first baffled by a recent example: a corrupt file with a
timestamp (mtime) of 11/8, days before the earliest date that the
volume could have been migrated to 1.6.0. Then I reread the above
report more closely, and realized that if the corruption is happening
in step 2), where clients only read the file, then perhaps no writes
need to be sent by the client for corruption to happen, so mtimes
might be irrelevant.
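
Since size and mtime apparently can't tell a good copy from a corrupt
one, comparing contents against the backup seems like the only reliable
check. Here is a minimal sketch of how one might do that in Python. The
function names and the chunk size are my own choices for illustration,
not anything that comes from OpenAFS or Excel:

```python
import hashlib

CHUNK = 64 * 1024  # read size in bytes; an arbitrary choice


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def first_difference(path_a, path_b):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(CHUNK)
            block_b = b.read(CHUNK)
            if block_a != block_b:
                # Locate the first differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # Common prefix matches; one file is simply shorter.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                return None  # both files ended with no difference
            offset += len(block_a)
```

Running first_difference() on the file in AFS and on the copy restored
from backup would show whether they really differ and, if so, where the
damage starts, which might hint at whether whole chunks are affected.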

Jonathan, or others who experienced this bug -- have you seen corrupt
Excel files with modification times that predate the volume being
served by 1.6.0?

jaltman, Derrick -- is my reconstruction above plausible?

If not, I don't look forward to finding another explanation for
"Excel could not open <FILENAME> because some content is
unreadable. Do you want to open and repair this workbook?"
followed by
"The workbook cannot be opened or repaired by Microsoft Excel because
it is corrupt."
:-(

thanks!
-arthur prokosch
system administrator
MIT Computer Science and Artificial Intelligence Lab.