[OpenAFS] Setting up new 1.8.2 cell: possible protection server issues

Jan Iven jan.iven@cern.ch
Sun, 10 Feb 2019 11:17:39 +0100


(off the top of my head):
* AFS clients and servers (even on the same machine) use different 
locations for CellServDB. It looks like the client file has all 3 hosts 
(which would explain the intermittent results), but the servers might 
not know about each other, which would mean elections don't work.

* I suggest checking ptserver status on each server (should print the version):
for i in ipa1 ipa2 ipa3; do rxdebug "${i}" 7002 -version; done

* check election status for ptserver:
for i in ipa1 ipa2 ipa3; do udebug "${i}" 7002; done

The current master will print all participants, and all servers should 
agree on which one is the "sync site". If there are messages about 
unknown servers, check the "server" CellServDB.
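
To compare what the client and the servers believe about the cell, here 
is a rough sketch that pulls the host entries for one cell out of two 
CellServDB files and reports whether they match. The function names 
(cell_hosts, compare_cellservdb) are made up for this example, and the 
paths in the usage note are assumptions: they are the traditional 
Transarc-style locations, and a packaged install may put the files 
elsewhere.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of OpenAFS.

# Print the sorted server addresses listed for one cell in a
# CellServDB-format file (">cellname #comment" followed by
# "address #hostname" lines).
# Usage: cell_hosts FILE CELL
cell_hosts() {
    awk -v cell="$2" '
        /^>/ {
            name = $1
            sub(/^>/, "", name)      # drop the leading ">"
            sub(/#.*/, "", name)     # drop a comment glued to the name
            in_cell = (name == cell)
            next
        }
        in_cell && NF { print $1 }   # first field is the address
    ' "$1" | sort
}

# Compare the entries for CELL in a client and a server CellServDB.
# Returns 0 if both list the same addresses, 1 otherwise.
# Usage: compare_cellservdb CLIENT_FILE SERVER_FILE CELL
compare_cellservdb() {
    client_hosts=$(cell_hosts "$1" "$3")
    server_hosts=$(cell_hosts "$2" "$3")
    if [ "$client_hosts" = "$server_hosts" ]; then
        printf 'OK: both files list the same hosts for %s\n' "$3"
        return 0
    fi
    printf 'MISMATCH for cell %s\n' "$3"
    printf '  client lists: %s\n' "$(printf '%s' "$client_hosts" | tr '\n' ' ')"
    printf '  server lists: %s\n' "$(printf '%s' "$server_hosts" | tr '\n' ' ')"
    return 1
}
```

Run on each of the three servers it would be called as, for example, 
compare_cellservdb /usr/vice/etc/CellServDB /usr/afs/etc/CellServDB cs.ru.is 
(adjust both paths to wherever your packages put them). "bos listhosts" 
against each server shows the server-side list as well.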

You also might want to split your problems:

* ptserver elections - these need the service key. Then just use 
"-localauth"
* same for vlserver (port 7003, same thing as above)
* Kerberos integration (i.e. identifying the "admin" account from 
Kerberos ticket -> AFS token)
* volume releasing - needs the service key, the vlserver election to 
work, and the admin account to be identified.


PS: please re-key, and do not post the new "asetkey list" output again.
Cheers
jan


On 10/02/2019 10:51, Joseph Timothy Foley wrote:
> Hi all
> 
> I’ve been getting help on the IRC channel setting up a new cell for our 
> CS department, but I’ve hit a roadblock that may need a 1.8.2 debugging 
> expert. (Many thanks to auristor, billings, and patbarron)
> 
> I have setup 3 Centos7 hosts with IPA:  ipa1.cs.ru.is, ipa2, ipa3.
> 
> IPA2 is the lowest numbered (for historical reasons) and is the Kerberos 
> primary.
> 
> The other two are replication sites.
> 
> I have setup the Openafs clients using the yum packages
> 
> I’ve tried to follow the quickstart and 
> https://wiki.openafs.org/admin/InstallingOpenAFSonRHEL/
> 
> To the best of my ability, but I think something is wrong with the 
> Protection server.
> 
> I’ve checked with rxdebug and there is connectivity between the 3 machines
> 
> I’ve added both “admin” and “foley” to system:adminstrators and using 
> “bos adduser” to all the machines.  “bos listuser” verifies this.
> 
> Symptom:
> 
> “pts membership admin” as admin works intermittently
> 
> [foley@ipa2 .cs.ru.is]$ pts membership admin
> 
> Groups admin (id: 1) is a member of:
> 
>    system:administrators
> 
> [foley@ipa2 .cs.ru.is]$ pts membership admin
> 
> pts: Permission denied ; unable to get membership of admin (id: 1)
> 
> But with “-localauth” it always works.
> 
> [foley@ipa2 .cs.ru.is]$ klist -e
> 
> Ticket cache: KEYRING:persistent:1298400006:krb_ccache_qrL87VL
> 
> Default principal: admin@CS.RU.IS
> 
> Valid starting       Expires              Service principal
> 
> 02/10/2019 09:42:12  02/11/2019 09:42:06  afs/cs.ru.is@CS.RU.IS
> 
>          Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
> 
> 02/10/2019 09:42:10  02/11/2019 09:42:06  krbtgt/CS.RU.IS@CS.RU.IS
> 
>          Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
> 
> [foley@ipa2 .cs.ru.is]$ bos listusers ipa2
> 
> SUsers are: admin foley
> 
> [foley@ipa2 .cs.ru.is]$ pts examine admin
> 
> pts: Permission denied ; unable to find entry for (id: 1)
> 
> [foley@ipa2 .cs.ru.is]$ pts examine admin
> 
> Name: admin, id: 1, owner: system:administrators, creator: 
> system:administrators,
> 
>    membership: 1, flags: S----, group quota: unlimited.
> 
> Possibly relevant logs:
> 
> On ipa2:  (the lowest ip address) after a restart in /var/openafs/logs/PTLog
> 
> Sun Feb 10 09:33:18 2019 Using 130.208.243.201 as my primary address
> 
> Sun Feb 10 09:33:18 2019 Starting AFS ptserver 1.1 
> (/usr/libexec/openafs/ptserver)
> 
> Sun Feb 10 09:33:21 2019 ubik: A Remote Server has addresses:
> 
> Sun Feb 10 09:33:21 2019 ... 130.208.243.202
> 
> Sun Feb 10 09:33:24 2019 ubik: A Remote Server has addresses:
> 
> Sun Feb 10 09:33:24 2019 ... 130.208.243.205
> 
> But no mention of an election.  I only see an election in the BackupLog.
> 
> I’ve tried setting a new key, just in case I got confused.
> 
> [root@ipa2 logs]#  asetkey list
> 
> rxkad_krb5      kvno    1 enctype 17; key is: 
> 3c54d85bad8dd99f938307e1a4bff2d5
> 
> rxkad_krb5      kvno    1 enctype 18; key is: 
> a55c654701f21cd871278f09727ee9c6e7809f05f8eeebdfea9777e94f610ce1
> 
> rxkad_krb5      kvno    2 enctype 17; key is: 
> 81f4e3ce6b8179833ad21a8539489a68
> 
> rxkad_krb5      kvno    2 enctype 18; key is: 
> b90bbfbb11aa16a2cb0079b66467fa517bdaa4af101ab6ffab400cc6471c827e
> 
> All done.
> 
> (I’ve checked these on all 3 to make sure they were the same)
> 
> Trying to delete the old key gives an error
> 
> [root@ipa2 logs]# asetkey delete 1
> 
> asetkey: Unknown code acfg 1 (70354689) while deleting key 1
> 
> Symptom 2:
> 
> I can’t release a read-only volume with those tickets
> 
> [foley@ipa2 .cs.ru.is]$ vos addsite ipa2 a root.afs
> 
> Could not lock the VLDB entry for the volume 536870915
> 
> VLDB: no permission access for call
> 
> Error in vos addsite command.
> 
> VLDB: no permission access for call
> 
> But –localauth works fine
> 
> [root@ipa2 logs]# vos addsite ipa2 a root.afs -localauth
> 
> Added replication site ipa2 /vicepa for volume root.afs
> 
> Symptom 3:
> 
> Even with all these issues, admin and foley can both create folders in 
> the RW volume of the cell!
> 
> System and Package information (all 3 hosts should be identical):
> 
> [foley@ipa2 user]$ uname -a
> 
> Linux ipa2.cs.ru.is 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 
> UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> Name        : openafs-client
> 
> Arch        : x86_64
> 
> Version     : 1.8.2
> 
> Release     : 1.el7
> 
> Size        : 1.1 M
> 
> Repo        : installed
> 
>  From repo   : storage-sig
> 
> Name        : openafs-server
> 
> Arch        : x86_64
> 
> Version     : 1.8.2
> 
> Release     : 1.el7
> 
> Size        : 9.1 M
> 
> Repo        : installed
> 
>  From repo   : storage-sig
> 
> Any help would be appreciated.
> 
> Kind regards,
> 
> Joe
> 
> --
> 
> Dr. Joseph T. Foley <foley@ru.is> Assistant Professor, Dept. of Science 
> & Engineering, Reykjavik University
> 
> Menntavegur 1, Nauthólsvík | 101 Reykjavík | Iceland | Phone: 
> +354-599-6569 | Fax +354-599-6201 | www.ru.is
>