[OpenAFS] Re: servers not establishing a quorum

Andrew Deason adeason@sinenomine.net
Tue, 6 Apr 2010 13:37:49 -0500


On Tue, 6 Apr 2010 13:56:50 -0400 (EDT)
lists@drewstud.com wrote:

> awesome.
> This may help as well:

> we have afs "pairs" at each location. We are syncing them with
> heartbeat/drbd.

Trying to do that with dbservers seems like overkill, but okay. So you
have a hot spare that starts up bosserver when the other node goes down,
I assume?

> We have tried to get it to only "show" the one floating vip via
> NetInfo

I haven't been thinking about the cluster-HA AFS thing recently, but I'm
not sure how necessary that is. Fileservers will register what addresses
they have on startup, so if the local IP is registered in the VLDB on
one fileserver, and it goes down and the other server comes up, the old
local IP should go away. If/when clients re-read VLDB information, they
won't get the IP for the downed fileserver.

> VLLog
> Tue Apr  6 13:23:37 2010 ubik: primary address 172.20.1.26 does not exist
> Tue Apr  6 13:23:37 2010 Using 172.20.125.226 as my primary address
> Contents of NetInfo:
> 172.20.125.226

That will work for fileservers, but I think for dbservers that's going
to cause problems like the one you're seeing. When 10.138.8.160 gets a
ping from 172.20.1.26, it doesn't know which site in the quorum it
corresponds to, since you told 172.20.1.26 not to advertise the
172.20.1.26 address. Preferably, for dbservers you would not specify
anything in NetInfo.
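
As a rough illustration of the mismatch, here is a toy model in Python.
It is not Ubik's actual code; the table and function names are made up
for the example, and it assumes that 10.138.8.160 knows the remote
dbserver only by the advertised address 172.20.125.226:

```python
# Toy model only: NOT Ubik's real implementation, just the matching idea.
# View from 10.138.8.160: the remote dbserver is known solely by the
# address it advertises (assumed here to be the floating VIP).
KNOWN_SITES = {
    "172.20.125.226": "remote dbserver (floating VIP)",
}


def site_for_source(source_addr, known_sites=KNOWN_SITES):
    """Return the quorum site a packet's source address maps to, or None."""
    return known_sites.get(source_addr)


# A packet sent from the advertised address can be matched to a site.
print(site_for_source("172.20.125.226"))

# A packet sent from the unadvertised local address matches nothing,
# so the receiver cannot tell which quorum member it came from.
print(site_for_source("172.20.1.26"))
```

In this model the second lookup finds no site, which is the situation
described above: the packets arrive, but from an address nobody was told
about.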

Alternatively, the easiest way for you to solve this would probably be
to just route outgoing packets such that they originate from
172.20.125.226 instead of 172.20.1.26 (enabled with some heartbeat
script). Would that be possible?
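
On Linux, one way this is sometimes done is with the "src" hint on a
route via iproute2. A minimal sketch, assuming a Linux host: the gateway
172.20.1.1 and device eth0 are hypothetical placeholders that do not
come from this thread, and the helper only builds the command for a
heartbeat script to run; it does not execute anything:

```python
def source_route_command(gateway, source_addr, device):
    """Build an iproute2 command that sets the preferred source address
    for packets leaving via the default route.

    gateway and device are site-specific; the values used below are
    hypothetical placeholders.
    """
    return [
        "ip", "route", "replace", "default",
        "via", gateway,
        "src", source_addr,
        "dev", device,
    ]


# Hypothetical gateway and interface; the source address is the VIP.
cmd = source_route_command("172.20.1.1", "172.20.125.226", "eth0")
print(" ".join(cmd))
```

The "src" hint only affects packets from sockets that are not bound to
a specific local address, and only traffic that matches that route, so
whether it is enough depends on how the servers are started and how the
network is laid out.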

-- 
Andrew Deason
adeason@sinenomine.net