[OpenAFS] recovering corrupted save on file

Rafael Marco de Lucas rmarco@ifca.unican.es
Thu, 08 Jun 2006 16:21:14 +0200

I think I have more or less the same problem...
I would like to recover a ton of files from some big AFS caches...

Is there any easy way, like copying the cache into a new AFS server?

If developing a tool for this makes sense, I might be able to
dedicate some time to it, but I would need a lot of help
since I have never looked inside the OpenAFS code.
(It could help in many AFS server hardware disasters with loss of data.)

I would appreciate any comments on this,


The cache does not store files; it contains blocks.  There is no
guarantee that a file will be entirely contained in the cache.

You do not specify which OS you are using.  On Windows the cache
file is simply a paging file for dedicated virtual memory.  Assuming
the afsd_service.exe is shut down cleanly it would be possible to
write code that could walk the contents of the cache to piece together
the blocks of the file that exist.  However, no such tools currently
exist.  The tool would have to be able to determine the cell, volume,
vnode, and unique values for the file in order to find the correct
stat cache entry and data buffers.
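
To make the idea concrete, here is a minimal sketch of the reassembly
step only. It is not an existing tool and it does not parse any real
cache format: the CachedBlock record, the reassemble function, and the
assumption that the file length is already known (for example from the
stat cache entry) are all hypothetical. It only shows how blocks tagged
with cell, volume, vnode, and unique could be pieced together and how
the missing ranges could be reported.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical identifier for one file: (cell, volume, vnode, unique).
Fid = Tuple[str, int, int, int]


@dataclass(frozen=True)
class CachedBlock:
    """Hypothetical record for one data buffer found in the cache."""
    fid: Fid
    offset: int  # byte offset of this block within the file
    data: bytes


def reassemble(blocks: Iterable[CachedBlock], fid: Fid,
               length: int) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Piece together whatever blocks of one file are present.

    Returns (contents, gaps). Ranges that are not in the cache are
    zero-filled and listed in gaps as half-open (start, end) pairs.
    """
    buf = bytearray(length)
    covered = []
    for block in blocks:
        # Skip blocks of other files and blocks outside the file.
        if block.fid != fid or block.offset < 0 or block.offset >= length:
            continue
        end = min(block.offset + len(block.data), length)
        buf[block.offset:end] = block.data[:end - block.offset]
        covered.append((block.offset, end))

    # Walk the covered ranges in order and record the holes between them.
    gaps = []
    pos = 0
    for start, end in sorted(covered):
        if start > pos:
            gaps.append((pos, start))
        pos = max(pos, end)
    if pos < length:
        gaps.append((pos, length))
    return bytes(buf), gaps
```

A non-empty gaps list means the file was only partially present in the
cache, which is the expected case since the cache gives no guarantee of
holding a whole file.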

Jeffrey Altman

David Bear wrote:
> We have one room in one building that on occasion has some very
> strange network disruption. It lasts about 8 minutes. When it clears
> everything works fine.
> The problem is that if someone has a file stored in an afs server open
> on their client, AND they perform a save when the network has its
> tantrum, the file is corrupted; either truncated to 0 bytes or filled
> with garbage.
> Is there any possible way to recover a file like this from the local
> cache manager? I assume the local machine would commit some form of
> the file to the local disk -- stored by the cache manager and then
> the actual write to the network would happen later.
> Of course we are trying to troubleshoot the network -- but at this
> university that's not something that happens very fast... especially
> when the problem is only intermittent.