[OpenAFS] fileserver not registering correctly

Erwin Broschinski broschi@id.ethz.ch
Mon, 30 Aug 2004 14:02:16 +0200 (MEST)


I am running OpenAFS 1.2.11 on all AFS servers under Solaris 8.
Two recently upgraded fileservers are behaving strangely.

To track the problem down, I moved all volumes off both servers and then ran:

bos stop server-n fs               (on both of them)
vos changeaddr server-n -remove    (only one entry was found in VLDB)

I then restarted all DB servers with bos restart, to make sure no trace of
the two fileservers was left in the VLDB.

After that, vos listaddrs showed neither of the two.

And now, here comes the mystery:

bos start server-1 fs
(server-1 shows up in vos listaddrs, as expected)

bos start server-2 fs
vos listaddrs then shows server-2 but server-1 is gone :*o
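
For anyone who wants to retrace the sequence, here is a rough sketch of the
steps above as a shell script. It is only an illustration: server-1 and
server-2 are the placeholder names used in this message, the run and repro
helpers are made up for the sketch, and the commands are written with explicit
-server, -instance and -oldaddr switches rather than the short forms quoted
above. By default it only prints the commands; nothing is executed unless
RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the sequence described in this message (hypothetical helper names).
# Commands are only printed unless RUN=1 is set; running them for real
# requires AFS admin rights on the cell.

run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

repro() {
    # Stop the fileserver on each host and remove its VLDB address entry.
    # The message passes the host name to changeaddr; an IP address may be
    # needed instead.
    for host in "$@"; do
        run bos stop -server "$host" -instance fs
        run vos changeaddr -oldaddr "$host" -remove
    done

    # (In the original sequence the DB servers were restarted at this point.)
    run vos listaddrs    # neither server should be listed now

    # Start the fileservers one at a time and look at the VLDB after each.
    for host in "$@"; do
        run bos start -server "$host" -instance fs
        run vos listaddrs
    done
}

repro server-1 server-2
```

With the behaviour described here, the last vos listaddrs would list only the
server that was started most recently.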

With the debug level set to 5, VLLog on the sync-site DB server showed:

Mon Aug 30 12:17:50 2004 The following fileserver is being registered in the
Mon Aug 30 12:17:50 2004 allbetter checking
   It will replace the following existing entry in the VLDB (same uuid):
      entry 3: []

How is it possible that these two servers have the same UUID?

If I then do a bos restart on server-1, vos listaddrs shows only server-1,
and the IP addresses in the VLLog messages are reversed.

The two servers have different MAC addresses, and different routers are
involved (arp -a server-n):

hme0   rou-hg-1-id-afs-cla.ethz.ch       00:0d:66:2d:f8:00
hme0   nethzafs-004.ethz.ch SP    08:00:20:a7:ba:8e


ge0    rou-rz-1-service-id-afs.ethz.ch       00:d0:00:02:5c:00
ge0    nethzafs-002.ethz.ch SP    08:00:20:9f:5e:7e

and different IP addresses (as you can see in VLLog). I have checked the
netmask and default route, and our network people say there is no proxy ARP.

Any idea what I can do?


