[OpenAFS] Setting up new 1.8.2 cell: possible protection server issues

Joseph Timothy Foley foley@ru.is
Sun, 10 Feb 2019 22:40:30 +0000


I gave up and wiped the AFS server installation on the 3 machines and reinstalled.

Now everything works.
Important note on the CentOS 7 packages:  You have to fill in all the files in /etc/openafs/server or you get the confusing message that it can't write to the cell database there.
I suspect that I got the protection database into a really, really confused state, such that Kerberos integration just gave up.
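
A minimal sketch of a check for that directory before starting the servers. The post only says "all the files", so the three names below (CellServDB, ThisCell, UserList) are an assumption about the minimum set, and the helper name missing_server_files is made up for illustration:

```shell
#!/bin/sh
# Assumed minimum set of files; the post itself does not list them.
required="CellServDB ThisCell UserList"

# missing_server_files is a hypothetical helper: it prints the name of
# every required file that is missing or empty in the directory $1.
missing_server_files() {
    dir=$1
    for f in $required; do
        [ -s "$dir/$f" ] || printf '%s\n' "$f"
    done
}

missing_server_files /etc/openafs/server
```

No output means every listed file exists and is non-empty; it says nothing about whether the contents are correct.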

If I get a chance, I will update the
https://wiki.openafs.org/admin/InstallingOpenAFSonRHEL/
with the things I have learned in the process.

This is all based upon the jsbillings packages.

Thank you all for helping me understand the process and how to test things.  It sped up the reinstall greatly.
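
For reference, the quorum checks suggested later in this thread can be collected into one loop. This is only a sketch: the short host names and the two ports (7002 for ptserver, 7003 for vlserver) come from the thread, while the helper name list_udebug_checks is made up for illustration. It prints the commands instead of running them, so the list can be reviewed first:

```shell
#!/bin/sh
# Host names and ports as mentioned in the thread; adjust for your own cell.
hosts="ipa1 ipa2 ipa3"
ports="7002 7003"   # 7002 = ptserver, 7003 = vlserver

# list_udebug_checks is a hypothetical helper: it prints one udebug
# command per host/port pair instead of executing anything.
list_udebug_checks() {
    for h in $hosts; do
        for p in $ports; do
            printf 'udebug %s %s\n' "$h" "$p"
        done
    done
}

list_udebug_checks
```

Piping the output to sh (list_udebug_checks | sh) runs the checks; all servers should then agree on the same sync site for each port.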

Kind regards,
Joe
--
Dr. Joseph T. Foley <foley@ru.is> Assistant Professor,  Reykjavik University +354-599-6569


-----Original Message-----
From: Benjamin Kaduk <kaduk@mit.edu>
Sent: Sunday, 10 February, 2019 19:26
To: Joseph Timothy Foley <foley@ru.is>
Cc: Jan Iven <jan.iven@cern.ch>; openafs-info@openafs.org
Subject: Re: [OpenAFS] Setting up new 1.8.2 cell: possible protection server issues

On Sun, Feb 10, 2019 at 01:56:09PM +0000, Joseph Timothy Foley wrote:
> First of all, thank you Jan for your insight.  I had forgotten about the "udebug" command.  I was puzzled because the backup logs mention when an election takes place, but the others don't!
>
> I have checked and they are finishing elections and agreeing on a sync site (ipa2) on both 7002 and 7003.
>
> This appears to be a Kerberos integration problem of some sort.  How do I go about figuring out why both foley and admin are not being identified correctly?
>
> This is doubly puzzling because the Kerberos integration worked fine
> when I initially had only one server installed.  The problem arose
> when I set up the two other DB servers.  (It also deleted foley and
> admin from the protection database, and deleted the root.afs volume!)

Is the name of the AFS cell the same (well, lowercased) as the Kerberos realm in question?  You may need the AFS configuration file krb.conf to specify what realm(s) to use, and/or you may need to have proper domain_realm and default_realm stanzas in the (Kerberos) krb5.conf files.
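
The first question can be checked mechanically. A minimal sketch, assuming the cell is named cs.ru.is and the realm is CS.RU.IS (both inferred from the klist output quoted below); the helper name realm_matches_cell is made up for illustration:

```shell
#!/bin/sh
# realm_matches_cell is a hypothetical helper: it succeeds when the
# Kerberos realm, converted to lower case, equals the AFS cell name.
realm_matches_cell() {
    cell=$1
    realm=$2
    lowered=$(printf '%s' "$realm" | tr '[:upper:]' '[:lower:]')
    [ "$lowered" = "$cell" ]
}

# Cell and realm names inferred from the klist output in this thread.
if realm_matches_cell cs.ru.is CS.RU.IS; then
    echo "realm matches cell name"
else
    echo "realm differs from cell name; check krb.conf and krb5.conf"
fi
```

A match only rules out the naming mismatch; it does not show that the krb5.conf realm mappings are correct.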

> I rekeyed as best I can, since asetkey gives this error:
> [root@ipa2 foley]# asetkey delete 1
> asetkey: Unknown code acfg 1 (70354689) while deleting key 1
>
> I did not realize that posting the key IDs would be a serious security hole.  I will not post them again.
>
> There is nothing in the cell at the moment, so I'm happy to wipe and start over if need be.

Probably should; the 'asetkey list' output does appear to be the raw key material.  (I thought it was a truncated hash, just to be able to say "is the stuff on machines A and B the same or different?", but the code suggests otherwise.)

-Ben

> Joe
> --
> Dr. Joseph T. Foley <foley@ru.is> Assistant Professor,  Reykjavik
> University +354-599-6569
>
>
> -----Original Message-----
> From: openafs-info-admin@openafs.org <openafs-info-admin@openafs.org>
> On Behalf Of Jan Iven
> Sent: Sunday, 10 February, 2019 11:18
> To: openafs-info@openafs.org
> Subject: Re: [OpenAFS] Setting up new 1.8.2 cell: possible protection
> server issues
>
> (top of my head):
> * AFS clients and servers (even on the same machine) use different locations for CellServDB. It looks like the client file has all 3 hosts (which would explain the intermittent results), but the servers might not know about each other, which would mean elections don't work.
>
> * suggest to check ptserver status on the servers (should print
> version):
> for i in ipa1 ipa2 ipa3; do rxdebug "${i}" 7002; done
>
> * check election status for ptserver:
> for i in ipa1 ipa2 ipa3; do udebug "${i}" 7002; done
>
> The current master will print all participants, and all servers should
> agree who is "sync site". If there are messages about unknown servers,
> see "server" CellServDB
>
> You also might want to split your problems:
>
> * ptserver elections - needs service key. Then just use "-localauth"
> * same for vlserver (port 7003, same thing as above)
> * Kerberos integration (i.e. identifying "admin" account from Kerberos
> ticket -> AFS token)
> * volume releasing - needs service key, vlserver election to work, admin account to be identified.
>
>
> PS: please re-key, and do not post the new "asetkey list" again.
> Cheers
> jan
>
>
> On 10/02/2019 10:51, Joseph Timothy Foley wrote:
> > Hi all
> >
> > I've been getting help on the IRC channel setting up a new cell for
> > our CS department, but I've hit a roadblock that may need a 1.8.2
> > debugging expert. (Many thanks to auristor, billings, and patbarron)
> >
> > I have set up 3 CentOS 7 hosts with IPA: ipa1.cs.ru.is, ipa2, ipa3.
> >
> > IPA2 is the lowest numbered (for historical reasons) and is the
> > Kerberos primary.
> >
> > The other two are replication sites.
> >
> > I have set up the OpenAFS clients using the yum packages.
> >
> > I've tried to follow the quickstart and
> > https://wiki.openafs.org/admin/InstallingOpenAFSonRHEL/
> > to the best of my ability, but I think something is wrong with the
> > Protection server.
> >
> > I've checked with rxdebug and there is connectivity between the 3
> > machines.
> >
> > I've added both "admin" and "foley" to system:administrators and,
> > using "bos adduser", to all the machines.  "bos listusers" verifies this.
> >
> > Symptom:
> >
> > "pts membership admin" as admin works intermittently
> >
> > [foley@ipa2 .cs.ru.is]$ pts membership admin
> > Groups admin (id: 1) is a member of:
> >   system:administrators
> >
> > [foley@ipa2 .cs.ru.is]$ pts membership admin
> > pts: Permission denied ; unable to get membership of admin (id: 1)
> >
> > But with "-localauth" it always works.
> >
> > [foley@ipa2 .cs.ru.is]$ klist -e
> > Ticket cache: KEYRING:persistent:1298400006:krb_ccache_qrL87VL
> > Default principal: admin@CS.RU.IS
> >
> > Valid starting       Expires              Service principal
> > 02/10/2019 09:42:12  02/11/2019 09:42:06  afs/cs.ru.is@CS.RU.IS
> >         Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
> > 02/10/2019 09:42:10  02/11/2019 09:42:06  krbtgt/CS.RU.IS@CS.RU.IS
> >         Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
> >
> > [foley@ipa2 .cs.ru.is]$ bos listusers ipa2
> > SUsers are: admin foley
> >
> > [foley@ipa2 .cs.ru.is]$ pts examine admin
> > pts: Permission denied ; unable to find entry for (id: 1)
> >
> > [foley@ipa2 .cs.ru.is]$ pts examine admin
> > Name: admin, id: 1, owner: system:administrators, creator: system:administrators,
> >   membership: 1, flags: S----, group quota: unlimited.
> >
> > Possibly relevant logs:
> >
> > On ipa2 (the lowest IP address), after a restart, in
> > /var/openafs/logs/PTLog:
> >
> > Sun Feb 10 09:33:18 2019 Using 130.208.243.201 as my primary address
> > Sun Feb 10 09:33:18 2019 Starting AFS ptserver 1.1 (/usr/libexec/openafs/ptserver)
> > Sun Feb 10 09:33:21 2019 ubik: A Remote Server has addresses:
> > Sun Feb 10 09:33:21 2019 ... 130.208.243.202
> > Sun Feb 10 09:33:24 2019 ubik: A Remote Server has addresses:
> > Sun Feb 10 09:33:24 2019 ... 130.208.243.205
> >
> > But no mention of an election.  I only see an election in the BackupLog.
> >
> > I've tried setting a new key, just in case I got confused.
> >
> > [root@ipa2 logs]# asetkey list
> > rxkad_krb5      kvno    1 enctype 17; key is: 3c54d85bad8dd99f938307e1a4bff2d5
> > rxkad_krb5      kvno    1 enctype 18; key is: a55c654701f21cd871278f09727ee9c6e7809f05f8eeebdfea9777e94f610ce1
> > rxkad_krb5      kvno    2 enctype 17; key is: 81f4e3ce6b8179833ad21a8539489a68
> > rxkad_krb5      kvno    2 enctype 18; key is: b90bbfbb11aa16a2cb0079b66467fa517bdaa4af101ab6ffab400cc6471c827e
> > All done.
> >
> > (I've checked these on all 3 to make sure they were the same)
> >
> > Trying to delete the old key gives an error:
> >
> > [root@ipa2 logs]# asetkey delete 1
> > asetkey: Unknown code acfg 1 (70354689) while deleting key 1
> >
> > Symptom 2:
> >
> > I can't release a read-only volume with those tickets
> >
> > [foley@ipa2 .cs.ru.is]$ vos addsite ipa2 a root.afs
> > Could not lock the VLDB entry for the volume 536870915
> > VLDB: no permission access for call
> > Error in vos addsite command.
> > VLDB: no permission access for call
> >
> > But -localauth works fine
> >
> > [root@ipa2 logs]# vos addsite ipa2 a root.afs -localauth
> > Added replication site ipa2 /vicepa for volume root.afs
> >
> > Symptom 3:
> >
> > Even with all these issues, admin and foley can both create folders
> > in the RW volume of the cell!
> >
> > System and Package information (all 3 hosts should be identical):
> >
> > [foley@ipa2 user]$ uname -a
> > Linux ipa2.cs.ru.is 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> >
> > Name        : openafs-client
> > Arch        : x86_64
> > Version     : 1.8.2
> > Release     : 1.el7
> > Size        : 1.1 M
> > Repo        : installed
> > From repo   : storage-sig
> >
> > Name        : openafs-server
> > Arch        : x86_64
> > Version     : 1.8.2
> > Release     : 1.el7
> > Size        : 9.1 M
> > Repo        : installed
> > From repo   : storage-sig
> >
> > Any help would be appreciated.
> >
> > Kind regards,
> >
> > Joe
> >
> > --
> >
> > Dr. Joseph T. Foley <foley@ru.is> Assistant Professor, Dept. of
> > Science & Engineering, Reykjavik University
> >
> > Menntavegur 1, Nauthólsvík | 101 Reykjavík | Iceland | Phone:
> > +354-599-6569 | Fax +354-599-6201 | www.ru.is
> >
>
> _______________________________________________
> OpenAFS-info mailing list
> OpenAFS-info@openafs.org
> https://lists.openafs.org/mailman/listinfo/openafs-info