[OpenAFS] AFS Fileserver Won't Start
Christopher D. Clausen
cclausen@acm.org
Wed, 3 Oct 2007 22:18:33 -0500
Karl M. Davis <karl@ridgetop-group.com> wrote:
Hi Karl. I'm going to assume it was you in the #openafs IRC channel.
I'd suggest staying logged in if you really want help; you have to wait
for people to have time to respond, and usually longer than the 15
minutes you waited. We do need to do things like eat and sleep.
> Somewhere towards the end of moving the volumes from the old server
> to the new server, things got badly goofed. The fs process will no
> longer start on the new server and I find the following entry in the
> /var/log/openafs/FileLog file:
>
> Wed Oct 3 19:26:59 2007 afs_krb_get_lrealm failed, using
> ridgetop-group.local.
Is the above a correct assumption about your realm? I would expect you
to be using ridgetop-group.com.
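If your Kerberos realm really is RIDGETOP-GROUP.COM, it may be worth
checking for a krb.conf next to the other server config files; when that
file is missing the code falls back, if I remember right, to the cell
name, which is probably where the .local comes from. A minimal sketch,
assuming a Debian-style layout since your logs are under
/var/log/openafs:

    # /etc/openafs/server/krb.conf  (older layouts use /usr/afs/etc/krb.conf)
    # single line naming the Kerberos realm the servers should accept
    RIDGETOP-GROUP.COM

That message by itself may be harmless; the VL_RegisterAddrs failure
below looks like the real problem.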
> Wed Oct 3 19:26:59 2007 VL_RegisterAddrs rpc failed; The IP address
> exists on a different server; repair it
Check the /etc/hosts file on all machines and all CellServDB files for
incorrect entries.
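If a stale registration from the old server is what the vlserver is
complaining about, you can usually see and clean it up from a machine
with admin rights. A rough sketch (the 192.0.2.10 below is a
placeholder, not an address I know anything about):

    # show the addresses/UUIDs the VLDB currently has registered for fileservers
    vos listaddrs -printuuid

    # remove a leftover registration for the old server's address
    vos changeaddr -oldaddr 192.0.2.10 -remove -localauth

Only remove an entry you are certain does not belong to a live
fileserver; -remove deletes that server's whole entry from the VLDB.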
> Wed Oct 3 19:26:59 2007 VL_RegisterAddrs rpc failed; See VLLog for
> details
What is in VLLog?
> Unfortunately, there's nothing helpful in VLLog. Interestingly, "vos
> listaddrs" returns nothing on the new server, either.
vos listaddrs might not be working because of the above errors.
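Once the new fileserver can register itself, it may also be worth
letting vos reconcile the VLDB with what is actually sitting on the vice
partitions. A hedged sketch, using the server name from your listvldb
output (substitute the new server's real name if that isn't it):

    # repair VLDB entries based on the volume headers on the server
    vos syncvldb -server picacho.ridgetop-group.local -localauth

    # and the reverse direction: fix the server to match the VLDB
    vos syncserv -server picacho.ridgetop-group.local -localauth

Don't bother running these until the fs instance starts, though; they
need the volserver side of it up, and it will just fail the same way.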
> Running "vos listvldb" returns the following:
> VLDB entries for all servers
> root.afs
> RWrite: 536870915 ROnly: 536870916
> number of sites -> 3
> server picacho.ridgetop-group.local partition /vicepa RW Site
> server picacho.ridgetop-group.local partition /vicepa RO Site
> server picacho.ridgetop-group.local partition /vicepa RO Site
>
> root.cell
> RWrite: 536870918 ROnly: 536870919
> number of sites -> 3
> server picacho.ridgetop-group.local partition /vicepa RW Site
> server picacho.ridgetop-group.local partition /vicepa RO Site
> server picacho.ridgetop-group.local partition /vicepa RO Site
>
> I'm unsure why there are duplicate RO entries, but the last thing I
> was working on was recreating RO volumes for root.cell and root.afs
> on the new server.
Well, it looks like something did not work out right.
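If it comes to cleaning up the VLDB by hand once the servers are
healthy, removing the extra RO site and re-releasing is usually the way
out. A rough sketch only, assuming the duplicate is the picacho /vicepa
RO entry and <new-fileserver> stands in for wherever you actually want
the replica:

    # drop the duplicate RO site entry from the VLDB
    vos remsite picacho.ridgetop-group.local /vicepa root.cell -localauth

    # re-add the RO site where it belongs and push a fresh clone
    vos addsite <new-fileserver> /vicepa root.cell -localauth
    vos release root.cell -localauth

and the same again for root.afs. None of that will work while the fs
instance is down, so the address problem still comes first.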
> I'm panicking because all of the volumes are now on the new server and
> non-accessible. Anyone have some clue what I did wrong and how I can
> fix things?
Probably going to need more information about what happened, what you
did to try to fix it, and answers to some infrastructure questions, like
how many AFS DB servers you actually have and whether any of them are
multi-homed.
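To save a round trip, output like the following from the new server
would answer most of that (paths assume the same Debian-style layout as
your logs; -localauth means run it as root on a server machine):

    # which machines this server believes are the DB servers
    bos listhosts picacho.ridgetop-group.local -localauth

    # what the VLDB has registered for fileserver addresses/UUIDs
    vos listaddrs -printuuid

    # whether the server is pinned to or excluded from particular interfaces
    cat /etc/openafs/server/NetInfo /etc/openafs/server/NetRestrict 2>/dev/null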
<<CDC