[OpenAFS] VLDB corruption

Michael Meffie mmeffie@sinenomine.net
Sat, 8 Nov 2014 10:52:35 -0500


On Sat, 8 Nov 2014 10:58:21 +0200
Kostas Liakakis <kostas@physics.auth.gr> wrote:

> Hello,
> 
> Reading about the recent thread for VLDB corruption I decided to take a 
> look at ours, again. vldb_check gives me about 3000 entries like this:
> 
> address 1477640 (offset 0x168c48): Free vlentry not on free chain
> 
> which -fix doesn't seem able to fix.
> 
> We had several vldb corruption issues in our cell, all caused by a 
> misbehaving 1.4.something server which is now fortunately retired. We 
> were able to repair the mess with vldb_check on every occasion but this 
> one. We are now running 1.6.10-2 everywhere but we still can't get rid 
> of this 1.4.x heritage...
> 
> Should we be worried about these errors? There doesn't seem to be a 
> problem so far.

Hello Kostas,

I think I see the issue here. vldb_check -fix does rebuild the
volume lookup hash tables, but it does not rebuild the free list. The
free list is the list of free slots in the database (holes), which
the vlserver reuses when allocating new records. If free slots are not
on the free list, the vlserver will simply not reuse them, making
your vldb file larger than needed.
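
To illustrate the idea, here is a toy model in Python. It is only a
sketch: it is not the real vldb on-disk layout and not the vldb_check
code, and the Slot class, the -1 end marker, and the function names are
made up for this example. It shows free slots that are missing from the
free chain, and what relinking them would look like.

```python
from dataclasses import dataclass

FREE = "free"
USED = "used"
END = -1  # hypothetical end-of-chain marker


@dataclass
class Slot:
    state: str
    next_free: int = END  # index of the next free slot on the chain


def chain_members(slots, head):
    """Walk the free chain from head; return the set of slot indexes on it."""
    seen = set()
    i = head
    while i != END and i not in seen:  # 'not in seen' guards against loops
        seen.add(i)
        i = slots[i].next_free
    return seen


def orphaned_free_slots(slots, head):
    """Free slots that the chain never reaches ("not on free chain")."""
    on_chain = chain_members(slots, head)
    return [i for i, s in enumerate(slots)
            if s.state == FREE and i not in on_chain]


def rebuild_free_chain(slots):
    """Relink every free slot into a single chain; return the new head."""
    head = END
    for i in range(len(slots) - 1, -1, -1):
        if slots[i].state == FREE:
            slots[i].next_free = head
            head = i
    return head


# Slots 3 and 4 are free but unreachable from the chain head (slot 1).
slots = [Slot(USED), Slot(FREE), Slot(USED), Slot(FREE), Slot(FREE)]
head = 1
orphans_before = orphaned_free_slots(slots, head)

head = rebuild_free_chain(slots)
orphans_after = orphaned_free_slots(slots, head)
```

In this model the orphaned slots do no harm to lookups. They are just
never handed out again until the chain is rebuilt.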

I'm working on a patch to fix this.

Thanks,
Mike



> 
> I see a -dumpvldb option on vldb_convert, but that utility seems to 
> depend on 'cnvldb' which is nowhere to be found in the default binary 
> installation.
> 
> Thanks,
> 
> -Kostas
> 


-- 
Michael Meffie <mmeffie@sinenomine.net>