[OpenAFS] Re: IBM AFS to OpenAFS upgrade complete... problems continue

Andrew Deason adeason@sinenomine.net
Tue, 4 May 2010 11:09:11 -0500


On Tue, 4 May 2010 08:45:23 +0300 (EEST)
Atro Tossavainen <atro.tossavainen+openafs@helsinki.fi> wrote:

> > My guess is that this would be the result of using 'vos changeaddr',
> > or at least something with MH vs non-MH hosts.
> 
> MH, non-MH?

MH for 'multihomed'. "Old" VLDB server entries just map a server index
to a single server IP. A long time ago the VLDB format was extended to
allow multiple IPs per server index, so MH entries map a server index to
a UUID and a list of IPs.

All modern fileservers register themselves as MH records in the VLDB.
'vos changeaddr' creates a non-MH record.
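
To make the difference concrete, here is a rough sketch of the two kinds
of entry. This is only a mental model, not the on-disk VLDB layout: the
class names, the helper, and the UUID string are all made up for
illustration. The IP and the index numbers are taken from the
vldb_check output quoted further down.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class PlainEntry:
    """Old-style (non-MH) entry: a server index maps to exactly one IP."""
    index: int
    ip: str


@dataclass
class MHEntry:
    """Multihomed entry: a server index maps to a UUID and a list of IPs."""
    index: int
    uuid: str
    ips: List[str] = field(default_factory=list)


def entry_addresses(entry: Union[PlainEntry, MHEntry]) -> List[str]:
    """Return every IP an entry stands for, whichever kind it is."""
    if isinstance(entry, MHEntry):
        return list(entry.ips)
    return [entry.ip]


# Hypothetical values: the UUID is a placeholder, not a real server UUID.
old_style = PlainEntry(index=2, ip="128.214.58.174")
multihomed = MHEntry(index=0, uuid="placeholder-uuid", ips=["128.214.58.174"])
```

Note that both example entries resolve to the same address, which is
exactly the kind of overlap that matters here.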

> Yes, I did have to use vos changeaddr when I changed the new file
> servers from their temporary IP addresses to the real ones in order
> to see any volumes at all.  I think I've reported that here.

You shouldn't need to. Modern fileservers will register their addresses
in the VLDB by themselves on startup. Using 'vos changeaddr' to do
anything besides deleting an address can cause problems that are
annoying to fix.

> root@bond / 17 # vldb_check -database /usr/afs/db/vldb.DB0 -servers
> VLDB_CHECK_WARNING: Ubik header size is 0 (should be 64)
> MH block 0, index 1: 128.214.58.174
> MH block 0, index 2: 128.214.88.114
>    Server ip addr 0 = MH block 0, index 1
>    Server ip addr 1 = MH block 0, index 2
>    Server ip addr 2 = 128.214.58.174
>    Server ip addr 3 = 128.214.88.114
[...]
> > With the output of running 'vldb_check -servers' on your VLDB, I'd be
> > able to tell.
> 
> All ears.  I can't see the obvious fault above.

"server ip addr 0" and "server ip addr 2" are effectively duplicates (as
are 1/3). The vlserver returns an error in this case: the fileserver
tries to register addresses that match both an MH entry and a non-MH
entry.
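
If you want to check for this mechanically, something like the
following sketch would do. It assumes 'vldb_check -servers' output in
exactly the form you pasted (with the "> " quoting stripped, and one
address per MH line); the function name and the regexes are my own
invention here, not anything shipped with OpenAFS.

```python
import re

# Sample taken from the vldb_check output quoted in this thread.
SAMPLE = """\
VLDB_CHECK_WARNING: Ubik header size is 0 (should be 64)
MH block 0, index 1: 128.214.58.174
MH block 0, index 2: 128.214.88.114
   Server ip addr 0 = MH block 0, index 1
   Server ip addr 1 = MH block 0, index 2
   Server ip addr 2 = 128.214.58.174
   Server ip addr 3 = 128.214.88.114
"""

MH_RE = re.compile(r"^MH block (\d+), index (\d+): (\S+)$")
SRV_MH_RE = re.compile(r"^Server ip addr (\d+) = MH block (\d+), index (\d+)$")
SRV_IP_RE = re.compile(r"^Server ip addr (\d+) = (\d+\.\d+\.\d+\.\d+)$")


def find_duplicates(text):
    """Return (mh_slot, plain_slot, ip) for each IP listed both ways."""
    mh_ips = {}       # (block, index) -> IP stored in that MH block entry
    mh_slots = []     # (slot, block, index): slots that point at an MH entry
    plain_slots = []  # (slot, ip): old-style slots that hold an IP directly
    for raw in text.splitlines():
        line = raw.strip()
        m = MH_RE.match(line)
        if m:
            mh_ips[(int(m.group(1)), int(m.group(2)))] = m.group(3)
            continue
        m = SRV_MH_RE.match(line)
        if m:
            mh_slots.append((int(m.group(1)), int(m.group(2)), int(m.group(3))))
            continue
        m = SRV_IP_RE.match(line)
        if m:
            plain_slots.append((int(m.group(1)), m.group(2)))
    dups = []
    for slot, block, index in mh_slots:
        ip = mh_ips.get((block, index))
        for plain_slot, plain_ip in plain_slots:
            if ip is not None and ip == plain_ip:
                dups.append((slot, plain_slot, ip))
    return dups


for mh_slot, plain_slot, ip in find_duplicates(SAMPLE):
    print("server ip addr %d and %d both resolve to %s" % (mh_slot, plain_slot, ip))
```

On the output above this pairs slot 0 with slot 2 and slot 1 with
slot 3, which are the duplicates I mean.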

> > VLLog contains nothing on _all_ dbservers?
> 
> That's right.  Here they are.

Oops, yeah, sorry. You don't see the error messages I'm thinking of
unless you're at debug level 5 or higher (run vlserver with '-d 5', or
send it two 'kill -TSTP's). But I think I know what it will say. It will
also suggest that you either 'vos changeaddr' the old address or
delete your sysid file, which is bad advice in this case, so it's
possibly better that the messages didn't appear :)
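
If you'd rather script the two signals than type 'kill -TSTP' twice,
a minimal sketch follows. The function name and its parameters are
hypothetical, and you have to supply the vlserver's pid yourself. Be
careful with that: SIGTSTP suspends most processes that don't handle
it. The pause between signals is there because two identical signals
sent back to back can be merged into one.

```python
import os
import signal
import time


def bump_debug_level(pid, times=2, delay=1.0, send=os.kill):
    """Send SIGTSTP to `pid` `times` times, pausing `delay` seconds between.

    `send` defaults to os.kill and is a parameter only so the function
    can be tried out without signalling a real process.
    """
    if pid <= 1:
        raise ValueError("refusing to signal pid %d" % pid)
    for i in range(times):
        if i > 0:
            time.sleep(delay)  # give the server time to handle the previous one
        send(pid, signal.SIGTSTP)
    return times
```

You would call it as bump_debug_level(pid_of_vlserver) on each
dbserver and then look at VLLog again.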

I _think_ the messages you're seeing about failing to register the
fileserver addresses are harmless for the moment, since 'server ip addr'
2 and 3 point to the correct IPs anyway (that is, I don't think they
are causing your connectivity problems). But the duplicates will cause
problems if you move your fileserver or change its IPs.

One way of fixing this is with 'vos changeloc's or 'vos sync*'s, but I
need to look again to make sure...

-- 
Andrew Deason
adeason@sinenomine.net