[OpenAFS] VLDB corruption
Sat, 8 Nov 2014 13:29:06 -0500
On Sat, 8 Nov 2014 10:52:35 -0500
Michael Meffie <email@example.com> wrote:
> On Sat, 8 Nov 2014 10:58:21 +0200
> Kostas Liakakis <firstname.lastname@example.org> wrote:
> > Hello,
> > Reading about the recent thread for VLDB corruption I decided to take a
> > look at ours, again. vldb_check gives me about 3000 entries like this:
> > address 1477640 (offset 0x168c48): Free vlentry not on free chain
> > which -fix doesn't seem able to fix.
> > We had several vldb corruption issues in our cell, all caused by a
> > misbehaving 1.4.something server which is now fortunately retired. We
> > were able to repair the mess with vldb_check on every occasion but this
> > one. We are now running 1.6.10-2 everywhere but we still can't get rid
> > of this 1.4.x heritage...
> > Should we be worried about these errors? There doesn't seem to be a
> > problem so far.
> Hello Kostas,
> I think I see the issue here. The vldb_check -fix does rebuild the
> volume lookup hash tables, but does not rebuild the free list. The
> free list is the list of free slots in the database (holes), which
> the vlserver reuses when allocating new records. If they are not
> in the free list, then the vlserver will just not reuse them, making
> your vldb file larger than needed.
> I'm working on a patch to fix this.
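To make the free list point above concrete, here is a rough Python sketch of
the idea. It is not the vlserver or vldb_check code and does not reflect the
real VLDB on-disk layout. The Slot class, its fields, and all function names
are made up for illustration, and it assumes each slot carries a free flag
plus the address of the next free slot, with 0 ending the chain.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Slot:
    """Hypothetical fixed-size database slot (not the real vlentry layout)."""
    free: bool          # True if the slot is a hole left by a deleted record
    next_free: int = 0  # address of the next free slot; 0 ends the chain


def walk_free_chain(slots: Dict[int, Slot], head: int) -> Set[int]:
    """Return the addresses reachable by following the free chain from head."""
    seen: Set[int] = set()
    addr = head
    # Stop at the terminator, at an unknown address, or on a loop.
    while addr != 0 and addr in slots and addr not in seen:
        seen.add(addr)
        addr = slots[addr].next_free
    return seen


def orphaned_free_slots(slots: Dict[int, Slot], head: int) -> List[int]:
    """Slots flagged free that the chain never reaches (never reused)."""
    on_chain = walk_free_chain(slots, head)
    return sorted(a for a, s in slots.items() if s.free and a not in on_chain)


def rebuild_free_chain(slots: Dict[int, Slot]) -> int:
    """Relink every free slot into one chain and return the new head."""
    head = 0
    for addr in sorted((a for a, s in slots.items() if s.free), reverse=True):
        slots[addr].next_free = head
        head = addr
    return head


if __name__ == "__main__":
    # Toy database: 100 is in use, 200 is on the chain, 300 and 400 are holes
    # that are flagged free but not linked from anywhere.
    db = {100: Slot(False), 200: Slot(True), 300: Slot(True), 400: Slot(True)}
    print(orphaned_free_slots(db, head=200))
    new_head = rebuild_free_chain(db)
    print(orphaned_free_slots(db, head=new_head))
```

In this toy model the orphaned slots are harmless to lookups, since nothing
points at them, but an allocator that only takes slots from the chain would
never hand them out again, so the file only grows.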
A small fix for vldb_check -fix is in gerrit at http://gerrit.openafs.org/11598
As Andrew mentioned, would you be willing to share a copy of your vldb.DB0 file?
Michael Meffie <email@example.com>