[OpenAFS] How to remove a bogus (127.0.1.1) server entry for readonly volume?

John Tang Boyland boyland@uwm.edu
Mon, 09 Dec 2013 07:37:54 -0600


I'm using OpenAFS 1.6.5 on Ubuntu 13.04.  Due to an interrupted release
and a fileserver that was not yet NetRestrict'ed from using 127.0.1.1
as one of its IPs, the volume location database (VLDB) is left with a
bogus site entry for a read-only volume:

$ vos examine common.usr.local
common.usr.local                  536873999 RW       5477 K  On-line
    peter.cae.uwm.edu /vicepa 
    RWrite  536873999 ROnly  536874000 Backup          0 
    MaxQuota      50000 K 
    Creation    Tue Apr  1 06:09:31 2008
    Copy        Fri Nov  8 03:20:10 2013
    Backup      Never
    Last Access Mon Dec  2 18:17:25 2013
    Last Update Tue Apr  9 14:17:01 2013
    0 accesses in the past day (i.e., vnode references)

    RWrite: 536873999     ROnly: 536874000     RClone: 536874000 
    number of sites -> 5
       server peter.cae.uwm.edu partition /vicepa RW Site  -- New release
       server solomons.cs.uwm.edu partition /vicepa RO Site  -- New release
       server jeremiah.cs.uwm.edu partition /vicepc RO Site  -- New release
       server 127.0.1.1 partition /vicepa RO Site  -- Old release
       server peter.cae.uwm.edu partition /vicepa RO Site  -- New release
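
To check whether other volumes picked up the same kind of entry, the
site lines can be filtered mechanically. This is only a minimal sketch:
the function name find_loopback_sites is my own (it is not part of
OpenAFS), and it assumes the site lines look exactly like the
"server ... partition ..." lines in the output above.

```shell
# Read `vos examine` output on stdin and print "server partition" for
# every site line whose server field is a loopback address (127.x.x.x).
# Assumes the line layout: server <host> partition <part> <type> Site ...
find_loopback_sites() {
    awk '$1 == "server" && $2 ~ /^127\./ && $3 == "partition" { print $2, $4 }'
}

# Intended use (requires a working vos client):
#   vos examine common.usr.local | find_loopback_sites
```

It only reports what the VLDB lists; it does not change anything.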

The bogus 127.0.1.1 site causes releases to stall until a timeout:

# vos release common.usr.local -localauth
[LONG PAUSE]
Failed to start a transaction on the RO volume.
Possible communication failure
The volume 536873999 could not be released to the following 1 sites:
	                          127.0.1.1 /vicepa
VOLSER: release could not be completed
Error in vos release command.
VOLSER: release could not be completed

I have been unable to remove the offending site:

$ vos remsite 127.0.1.1 a common.usr.local -verbose
This site is not a replication site 
Error in vos remsite command.
VOLSER: illegal operation
$ vos remsite 127.0.1.1 /vicepa common.usr.local -verbose
This site is not a replication site 
Error in vos remsite command.
VOLSER: illegal operation
$ vos remsite 127.0.1.1 a common.usr.local.readonly -verbose
This site is not a replication site 
Error in vos remsite command.
VOLSER: illegal operation

$ vos delentry -prefix common.usr.local -server 127.0.1.1 -partition a -verbose
Deleting VLDB entries for server 127.0.1.1 partition /vicepa which are prefixed with common.usr.local 
----------------------
Total VLDB entries deleted: 0; failed to delete: 0

$ vos remove 127.0.1.1 a common.usr.local.readonly
[LONG PAUSE]
Could not fetch the list of partitions from the server
Possible communication failure
Possible communication failure

$ vos listaddrs
peter.cae.uwm.edu
solomons.cs.uwm.edu
jeremiah.cs.uwm.edu

I'm really surprised that "vos remsite" doesn't work, given that
"vos examine" lists 127.0.1.1 as an RO site.  What am I doing wrong?

Best regards,
John