[OpenAFS] Weird Quorum Issues
Hartmut Reuter
reuter@rzg.mpg.de
Thu, 06 Nov 2003 16:31:31 +0100
Aaron Stanley wrote:
> Some additional information for your consideration now that I'm back at the
> office:
>
> Output of udebug <server> 7000
> Return code -1 from VOTE_Debug
>
> Errors in FileLog:
> VL_RegisterAddrs rpc failed; will retry periodically (code=5376, err=4)
>
> The above error showed up on all my servers but has now stopped (last
> reported error that I can see was ~3am this morning). I still, however, get
> the on/off quorum. I was able to unlock a volume this morning, but can't
> backup or release because it times out during the operation.
>
> What does the FileLog entry mean?
Fileservers register their UUID and IP addresses with the VLDB server at
start time. A client then gets the actual IP address of the fileserver
it wants to contact from the VLDB.
The registration requires a write to the database and can be performed
only on the sync site. If for some reason you have problems with your
sync site, this error message appears in the FileLog.
To your primary problem:
How many database servers are you running, and with which IP addresses?
If you do a "bos listhosts", do you get the same information from all of them?
Have the database processes been restarted after the last change to the
host list?
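
The host list comparison can be scripted. The following is only a rough
sketch in Python, not something taken from the OpenAFS distribution. It
assumes that "bos listhosts" prints one "Host N is <name>" line per
database server; the function names (parse_listhosts, run_listhosts,
find_mismatches) are made up for this illustration.

```python
import re
import subprocess

# Assumed output format of "bos listhosts": one "Host N is <name>" line
# per database server. Adjust the pattern if your version prints differently.
HOST_LINE = re.compile(r"^\s*Host\s+\d+\s+is\s+(\S+)", re.MULTILINE)


def parse_listhosts(text):
    """Return the set of database server names found in listhosts output."""
    return frozenset(name.lower() for name in HOST_LINE.findall(text))


def run_listhosts(server):
    """Run 'bos listhosts' against one server and return its raw output."""
    result = subprocess.run(
        ["bos", "listhosts", "-server", server],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_mismatches(outputs):
    """Compare every server's host list against the first one given.

    outputs maps a server name to its raw listhosts output. The result maps
    each disagreeing server to the sorted names present on only one side.
    """
    parsed = {server: parse_listhosts(text) for server, text in outputs.items()}
    reference = next(iter(parsed.values()), frozenset())
    return {
        server: sorted(hosts ^ reference)
        for server, hosts in parsed.items()
        if hosts != reference
    }
```

Collect the output with run_listhosts for each database server, put it in
a dictionary keyed by server name and pass that to find_mismatches. An
empty result means all servers report the same host list.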
-Hartmut
>
> - AB
>
>
--
-----------------------------------------------------------------
Hartmut Reuter e-mail reuter@rzg.mpg.de
phone +49-89-3299-1328
RZG (Rechenzentrum Garching) fax +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------