[OpenAFS] Re: heartbeat and OpenAFS
Brent A Nelson
brent@phys.ufl.edu
Wed, 26 Mar 2003 19:39:42 -0500 (EST)
Well, failover still isn't working.
I'm now using NetInfo to restrict the fileserver to reporting only the
failover address to the volume database. It's kind of awkward in my
setup, as I do want both servers to be always active as database servers,
so I have to start the database servers without the NetInfo file and then
move the NetInfo file into place before starting the fileserver.
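The "move the NetInfo file into place" step could be scripted roughly as follows. This is only a sketch, assuming the file holds one dotted-decimal address per line; the helper name install_netinfo and the address 192.0.2.10 are made up for illustration, and the real target would be /usr/afs/local/NetInfo.

```python
import os
import tempfile


def install_netinfo(addresses, final_path):
    """Write one address per line to a staging file, then rename it into place.

    Hypothetical helper: the rename means the fileserver never sees a
    half-written file if it happens to start at the same moment.
    """
    directory = os.path.dirname(final_path) or "."
    # Stage in the same directory so os.replace stays on one filesystem.
    fd, staging = tempfile.mkstemp(dir=directory, prefix="NetInfo.")
    with os.fdopen(fd, "w") as f:
        for addr in addresses:
            f.write(addr + "\n")
    os.replace(staging, final_path)
    return final_path


if __name__ == "__main__":
    # Demo against a scratch directory rather than /usr/afs/local.
    with tempfile.TemporaryDirectory() as scratch:
        path = install_netinfo(["192.0.2.10"], os.path.join(scratch, "NetInfo"))
        print(open(path).read(), end="")
```

Run it after the database servers are up and before starting the fileserver; removing the file again before the next restart of the database servers is left out here.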
However, I seem to have the same problem that was discussed in December
regarding IP aliases on Solaris boxes (see the "multiple network
interfaces in AFS" thread), although my servers are Linux: the fileserver
answers a client request presumably through the main IP address of the
machine rather than the failover IP, and, even though the volume database
only lists the failover address, the client erroneously learns the main IP
of the machine. When I fail over that machine, the client complains that
it lost contact with the fileserver.
Has anyone figured out a workaround for this or come up with a patch?
Perhaps the fileserver could bind only to the interfaces permitted by
NetInfo/NetRestrict (or, even better, take command-line options for
this).
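To illustrate what I mean by binding, here is a generic UDP sketch in Python. It is not OpenAFS or Rx code, and the function names are made up. The point is only that a socket bound to one specific local address sends its replies from that address, whereas a socket bound to the wildcard address leaves the choice of source address to the kernel. The demo uses 127.0.0.1 as a stand-in for the failover address.

```python
import socket


def bind_udp(bind_addr):
    """Return a UDP socket bound to one specific local address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, 0))  # port 0: let the OS pick a free port
    return sock


def reply_source(server_sock):
    """Send one datagram to the server, echo it back, and return the
    source IP address the client sees on the reply."""
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    server_sock.settimeout(2.0)
    try:
        client.sendto(b"ping", server_sock.getsockname())
        data, peer = server_sock.recvfrom(1024)
        server_sock.sendto(data, peer)      # reply leaves via the bound address
        _, source = client.recvfrom(1024)
        return source[0]
    finally:
        client.close()


if __name__ == "__main__":
    server = bind_udp("127.0.0.1")  # stand-in for the failover IP
    try:
        print(reply_source(server))
    finally:
        server.close()
```

If the fileserver did the equivalent with the address from NetInfo, the client should only ever see the failover IP on replies.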
Thanks,
Brent
On Mon, 17 Mar 2003, Marc Schmitt wrote:
> Hi Brent,
>
> I assume you're using /usr/afs/local/NetInfo and
> /usr/afs/local/NetRestrict to restrict the IP address to the failover IP
> and use the heartbeat IPsrcaddr resource script to make packets appear
> from the failover IP and not from the *real* IP of the HA servers?
>
> Greetz
> Marc
>
> Brent A Nelson wrote:
> > Cool. I shut down the bosserver on one machine, hex-edited the sysid file
> > to match the UUID of the other server, did "vos changeaddr -remove" on the
> > addresses listed under the old UUID (from "vos listaddr -printuuid"),
> > restarted bos, failed over the file server, and it worked!
> >
> > The server-side fails over very quickly (a few seconds, depending on how
> > long salvage takes), but active clients do experience some sort of timeout
> > of a couple of minutes or so, even with the failover IP address... Any
> > suggestions on how I might further minimize this delay?
> >
> > Thanks,
> >
> > Brent
>
>