[OpenAFS] Re: vldb_check -servers cleanup and empty server entry messages

Andrew Deason adeason@sinenomine.net
Thu, 4 Mar 2010 11:24:13 -0600


On Thu, 25 Feb 2010 16:08:46 -0500
"John W. Sopko Jr." <sopko@cs.unc.edu> wrote:

> % vldb_check /usr/afs/db/vldb.DB0 -servers |& head -40
> VLDB_CHECK_WARNING: Ubik header size is 0 (should be 64)
> MH block 0, index 1: 152.2.128.4
> MH block 0, index 3: 152.2.128.3
> MH block 0, index 4: 152.2.129.145
> MH block 0, index 31: 152.2.129.25
> MH block 0, index 32: 152.2.128.34
>     Server ip addr 4 = 152.2.128.157
>     Server ip addr 5 = 152.2.128.161
[...]
> The entries like "Server ip addr ..." are not file or db servers and in
> some cases are not in DNS. These IPs are owned by us but are not
> file servers.

The entries like 'Server ip addr ...' are non-multihomed entries. All
modern OpenAFS fileservers will register themselves as multihomed, so
non-multihomed entries will typically be a result of either a really old
fileserver, or someone running 'vos changeaddr' without -remove.

Entries from really old fileservers are nothing to worry about; they
just stay in the database until you delete them (as you did). They could
also come from someone running 'vos changeaddr', though, so make sure
you don't use that command without the -remove flag (as documented,
using it for anything except removing addresses isn't a good idea).
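
If you want to pick the non-multihomed entries out of a long listing, a
small script can separate the two kinds of lines. This is only a rough
sketch in Python: the regular expressions are based solely on the sample
lines quoted above (not on the vldb_check source), and the function name
classify() is made up for the example.

```python
import re

# Line shapes taken from the quoted 'vldb_check -servers' sample only.
MH_RE = re.compile(r"^\s*MH block (\d+), index (\d+): (\S+)")
PLAIN_RE = re.compile(r"^\s*Server ip addr (\d+) = (\S+)")


def classify(lines):
    """Split output lines into multihomed and non-multihomed entries.

    Returns two lists of (index, address) tuples; lines that match
    neither pattern (warnings, volume info, ...) are ignored.
    """
    multihomed, plain = [], []
    for line in lines:
        m = MH_RE.match(line)
        if m:
            multihomed.append((int(m.group(2)), m.group(3)))
            continue
        m = PLAIN_RE.match(line)
        if m:
            plain.append((int(m.group(1)), m.group(2)))
    return multihomed, plain


sample = [
    "VLDB_CHECK_WARNING: Ubik header size is 0 (should be 64)",
    "MH block 0, index 1: 152.2.128.4",
    "MH block 0, index 3: 152.2.128.3",
    "MH block 0, index 4: 152.2.129.145",
    "MH block 0, index 31: 152.2.129.25",
    "MH block 0, index 32: 152.2.128.34",
    "    Server ip addr 4 = 152.2.128.157",
    "    Server ip addr 5 = 152.2.128.161",
]

mh_entries, plain_entries = classify(sample)
for index, addr in plain_entries:
    print(f"non-multihomed entry {index}: {addr}")
```

The addresses it prints are the candidates to look at before deciding
whether a 'vos changeaddr <addr> -remove' is appropriate.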

> Previous to this vldb_check I ran another check and there were 4
> "server ip addr ..." entries that were not in the IP range we own or
> DNS.  I used "vos changeaddr 128.109.136.161 -remove" to remove the
> 4 entries.
> 
> I then ran vldb_check /usr/afs/db/vldb.DB0 -servers again, the entries
> were removed but I now get a bunch of "empty server entry 0" entries!
> A sample of a few entries is shown below. The volumes that are
> displayed seem to be fine. It is making me nervous!

This is nothing to worry about; this is a bug in vldb_check. vldb_check
doesn't read in the entries properly, so it thinks server numbers are
always 0. If you have a server in index 0, it doesn't think there's a
problem.  If you don't have a server in index 0, it thinks the entry is
pointing to something nonexistent, which is a problem. You happened to
delete the server entry in slot 0, so now it's whining at you. But
there's nothing wrong with doing that; vldb_check just isn't reading the
entries right.
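
To make the effect concrete, here is a toy model of it in Python. It is
not the vldb_check code: the data layout, the function name
check_entries(), the volume names and the addresses are all invented for
illustration. The only thing it mimics is the behavior described above,
where every server number is read as 0.

```python
def check_entries(server_table, volume_entries, buggy):
    """Return warnings for volume entries that point at an empty slot.

    server_table:   dict of slot index -> address (absent key = empty slot)
    volume_entries: list of (volume name, server slot index)
    buggy:          if True, mimic the misread where every server
                    number comes out as 0
    """
    warnings = []
    for name, slot in volume_entries:
        seen = 0 if buggy else slot
        if seen not in server_table:
            warnings.append(f"{name}: empty server entry {seen}")
    return warnings


# Hypothetical data: slot 0 holds a stale entry, volumes live on slot 4.
servers = {0: "192.0.2.1", 4: "192.0.2.4"}
volumes = [("vol.a", 4), ("vol.b", 4)]

print(check_entries(servers, volumes, buggy=True))   # slot 0 exists: quiet

del servers[0]                                       # remove the stale entry
print(check_entries(servers, volumes, buggy=True))   # now it complains
print(check_entries(servers, volumes, buggy=False))  # a correct read is quiet
```

In this model the volume entries never change; only the presence of
something in slot 0 decides whether the misreading check complains.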

I just filed ticket 126661 about this; follow it if you want the fix.
But for now, it's nothing to worry about.

-- 
Andrew Deason
adeason@sinenomine.net