From svamberg@gmail.com Thu Mar 1 14:41:04 2018
From: svamberg@gmail.com (Michal Švamberg)
Date: Thu, 1 Mar 2018 15:41:04 +0100
Subject: [OpenAFS] Invalid AFSFetchStatus - inaccesible data
Message-ID:

--089e08235a1079845f05665ad99e
Content-Type: text/plain; charset="UTF-8"

Hi,
some files in a volume are inaccessible, and the client logged this message:
[   56.306458] afs: FetchStatus ec 0 iv 1 ft 0 pv 947 pu 17570
[   56.306461] afs: Invalid AFSFetchStatus from server 147.228.54.17
[   56.306463] afs: This suggests the server may be sending bad data that can lead to availability issues or data corruption. The issue has been avoided for now, but it may not always be detectable. Please upgrade the server if possible.
[   56.306469] afs: Waiting for busy volume 875764977 (user.wimmer) in cell zcu.cz

I tried bos salvage, vos move, and vos dump & restore, but nothing helped.

Without a token I get a list of the files in the volume:
LANG=C ls /afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client
ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/fsync.c': Permission denied
<...output is omitted...>
ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/drivernt.bat': Permission denied
ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/c-client.def': Permission denied
c-client.def  fsync.c     log_a41.c  os_a41.c  os_osf.c  os_sos.c  writev.c
drivernt.bat  gethstid.c  log_sco.c  os_a41.h  os_slx.h  write.c   writevs.c

With a token, I wait a long time (minutes) for the first error line:
/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client# ls
ls: cannot access 'fsync.c': Resource temporarily unavailable

I tried the OpenAFS client and server in versions 1.6.20 and 1.8.0~pre4 from Debian.

Thanks for any help.
Michal Svamberg

--089e08235a1079845f05665ad99e
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--089e08235a1079845f05665ad99e--

From svamberg@gmail.com Thu Mar 1 15:28:43 2018
From: svamberg@gmail.com (Michal Švamberg)
Date: Thu, 1 Mar 2018 16:28:43 +0100
Subject: [OpenAFS] Re: Invalid AFSFetchStatus - inaccesible data
In-Reply-To:
References:
Message-ID:

--001a1145a70cea410705665b8351
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

A small update: 'ls -l' with an AFS token takes 25 minutes.

# LANG=C time ls -l /afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client
ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/fsync.c': Resource temporarily unavailable
ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/drivernt.bat': Resource temporarily unavailable
total 31
-rw------- 1 1004 users 1716 May 23  1996 c-client.def
?????????? ? ?    ?        ?            ? drivernt.bat
?????????? ? ?    ?        ?            ? fsync.c
-rw------- 1 1004 users 1712 May  4  1995 gethstid.c
-rw------- 1 1004 users 2493 Jun 19  1996 log_a41.c
-rw------- 1 1004 users 2710 May 16  1995 log_sco.c
-rw------- 1 1004 users 2314 Feb  7  1996 os_a41.c
-rw------- 1 1004 users 1959 Feb  7  1996 os_a41.h
-rw------- 1 1004 users 2080 Feb  1  1995 os_osf.c
-rw------- 1 1004 users 1874 May  7  1996 os_slx.h
-rw------- 1 1004 users 2134 Sep  1  1995 os_sos.c
-rw------- 1 1004 users 2854 Jun 29  1995 write.c
-rw------- 1 1004 users 2615 Jul  7  1995 writev.c
-rw------- 1 1004 users 1948 Jun 29  1995 writevs.c
Command exited with non-zero status 1
0.00user 0.00system 25:10.44elapsed 0%CPU (0avgtext+0avgdata 2588maxresident)k
0inputs+0outputs (0major+140minor)pagefaults 0swaps

How can I solve this?

Michal Svamberg

2018-03-01 15:41 GMT+01:00 Michal Švamberg <svamberg@gmail.com>:

> Hi,
> some files in a volume are inaccessible, and the client logged this message:
> [   56.306458] afs: FetchStatus ec 0 iv 1 ft 0 pv 947 pu 17570
> [   56.306461] afs: Invalid AFSFetchStatus from server 147.228.54.17
> [   56.306463] afs: This suggests the server may be sending bad data that
> can lead to availability issues or data corruption. The issue has been
> avoided for now, but it may not always be detectable. Please upgrade the
> server if possible.
> [   56.306469] afs: Waiting for busy volume 875764977 (user.wimmer) in
> cell zcu.cz
>
> I tried bos salvage, vos move, and vos dump & restore, but nothing helped.
>
> Without a token I get a list of the files in the volume:
> LANG=C ls /afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client
> ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/fsync.c': Permission denied
> <...output is omitted...>
> ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/drivernt.bat': Permission denied
> ls: cannot access '/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client/c-client.def': Permission denied
> c-client.def  fsync.c     log_a41.c  os_a41.c  os_osf.c  os_sos.c  writev.c
> drivernt.bat  gethstid.c  log_sco.c  os_a41.h  os_slx.h  write.c   writevs.c
>
> With a token, I wait a long time (minutes) for the first error line:
> /afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client# ls
> ls: cannot access 'fsync.c': Resource temporarily unavailable
>
> I tried the OpenAFS client and server in versions 1.6.20 and 1.8.0~pre4 from
> Debian.
>
> Thanks for any help.
> Michal Svamberg

--001a1145a70cea410705665b8351
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
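The `ls -l` listing in this message shows `?` placeholders for the two entries whose status could not be read. A minimal sketch of how such entries might be enumerated for a whole directory follows. It assumes a Python 3 interpreter on the AFS client; the function name `find_unreadable_entries` is made up for this illustration and is not part of any OpenAFS tool. It only reports which `lstat()` calls fail and with which errno (for example EACCES for "Permission denied" or EAGAIN for "Resource temporarily unavailable"); it does not repair anything.

```python
import errno
import os


def find_unreadable_entries(directory):
    """Return a list of (name, errno_name) for entries whose lstat() fails.

    Hypothetical helper for illustration only. It mirrors what `ls -l`
    does per entry: list the directory, then stat each name.
    """
    failures = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            os.lstat(path)
        except OSError as exc:
            # Translate the numeric errno into its symbolic name when known.
            failures.append((name, errno.errorcode.get(exc.errno, str(exc.errno))))
    return failures
```

Calling `find_unreadable_entries("/afs/.zcu.cz/users/w/wimmer/home/app/ult/pine3.95/imap/ANSI/c-client")` would be one way to use it. With a token, each failing lookup may block for minutes, as reported above, so a scan of a large tree could take a long time.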
--001a1145a70cea410705665b8351--

From jaltman@auristor.com Thu Mar 1 16:06:15 2018
From: jaltman@auristor.com (Jeffrey Altman)
Date: Thu, 1 Mar 2018 11:06:15 -0500
Subject: [OpenAFS] Invalid AFSFetchStatus - inaccesible data
In-Reply-To:
References:
Message-ID:

This is a cryptographically signed message in MIME format.

--------------ms040506020105010203090406
Content-Type: multipart/mixed; boundary="------------42CDCC9DFF78313390F49C80"
Content-Language: en-US

This is a multi-part message in MIME format.
--------------42CDCC9DFF78313390F49C80
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 3/1/2018 9:41 AM, Michal Švamberg wrote:
> Hi,
> some files in a volume are inaccessible, and the client logged this message:
> [   56.306458] afs: FetchStatus ec 0 iv 1 ft 0 pv 947 pu 17570
> [   56.306461] afs: Invalid AFSFetchStatus from server 147.228.54.17
> [   56.306463] afs: This suggests the server may be sending bad data
> that can lead to availability issues or data corruption. The issue has
> been avoided for now, but it may not always be detectable. Please
> upgrade the server if possible.
> [   56.306469] afs: Waiting for busy volume 875764977 (user.wimmer) in
> cell zcu.cz

As recorded on disk, the file type for 875764977.947.17570 is 0 (invalid). As such, it is neither a directory, a file, a symlink nor a mount point and cannot be processed by the client.

> I tried bos salvage, vos move, and vos dump & restore, but nothing helped.

The salvager doesn't know how to fix vnodes whose on-disk metadata contains an invalid vnode type. Moving, dumping and restoring the volume will simply move, dump and restore the vnode with the invalid type value.

These are the available options:

1. restore the volume from a backup prior to the introduction of the
   on-disk damage. If vnode 875764977.947.17570 is damaged then it is
   possible that other vnodes are as well.

2. edit the vnode metadata stored in the vice partition

3. delete the damaged vnode by removing its directory entry and
   restoring the file data from backup or other sources

4. delete the vnode in the vice partition and salvage to clean up the
   directory

The warning message from the client is misleading in that the fileserver is not generating bogus information but the data on-disk is already bogus.

Jeffrey Altman
AuriStor, Inc.

--------------42CDCC9DFF78313390F49C80--
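The `FetchStatus` warning can be picked apart mechanically when checking many client logs for the same symptom. The sketch below is an illustration under stated assumptions, not an OpenAFS interface: the regular expression is derived only from the single log line quoted in this thread, `ft` is read as the file type because the reply above identifies the file type as 0 (invalid), and the remaining fields are returned without interpretation. The names `parse_fetchstatus` and `has_invalid_file_type` are hypothetical.

```python
import re

# Pattern inferred from the one kernel log line quoted in this thread
# (an assumption; other client versions may format the line differently).
FETCHSTATUS_RE = re.compile(
    r"afs: FetchStatus ec (?P<ec>\d+) iv (?P<iv>\d+) ft (?P<ft>\d+) "
    r"pv (?P<pv>\d+) pu (?P<pu>\d+)"
)


def parse_fetchstatus(line):
    """Return the numeric fields of a FetchStatus warning, or None."""
    match = FETCHSTATUS_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


def has_invalid_file_type(fields):
    """True when `ft` is 0, the value described as invalid in this thread."""
    return fields["ft"] == 0
```

For the line reported here, `parse_fetchstatus` yields `ft` equal to 0, which matches the diagnosis that the vnode's on-disk type is invalid.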
--------------ms040506020105010203090406--

From anders.j.nordin@ltu.se Fri Mar 2 08:47:26 2018
From: anders.j.nordin@ltu.se (Anders Nordin)
Date: Fri, 2 Mar 2018 08:47:26 +0000
Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
In-Reply-To: <20180209000156.GM12363@mit.edu>
References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu>
Message-ID:

Hello,

Is there any progress on this issue? Can we expect a stable release for RHEL 7.5?

Best regards
Anders

-----Original Message-----
From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org] On Behalf Of Benjamin Kaduk
Sent: 9 February 2018 01:02
To: Kodiak Firesmith
Cc: openafs-info
Subject: Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up

On Wed, Feb 07, 2018 at 11:46:28AM -0500, Kodiak Firesmith wrote:
> Hello again All,
>
> As part of continued testing, I've been able to confirm that the
> SystemD double-service startup thing only happens to my hosts when
> going from RHEL 7.4 to RHEL 7.5beta.
> On a test host installed directly as RHEL 7.5beta, I get a bit
> farther with 1.6.18.22, in that I get to the point where OpenAFS
> "kind of" works.

Thanks for tracking this down. The rpm packaging maintainers may want to try to track down why the double-start happens in the upgrade scenario, as that's pretty nasty behavior.

> What I'm observing is that the openafs client Kernel module (built by
> DKMS) loads fine, and just so long as you know where you need to go in
> /afs, you can get there, and you can read and write files and the
> OpenAFS 'fs' command works. But doing an 'ls' of /afs or any path
> underneath results in
> "ls: reading directory /afs/: Not a directory".
>
> I ran an strace of a good RHEL 7.4 host running ls on /afs, and a RHEL
> 7.5beta host running ls on /afs and have created pastebins of both, as
> well as an inline diff.
>
> All can be seen at the following locations:
>
> works
> https://paste.fedoraproject.org/paste/Hiojt2~Be3wgez47bKNucQ
>
> fails
> https://paste.fedoraproject.org/paste/13ZXBfJIOMsuEJFwFShBfg
>
> diff
> https://paste.fedoraproject.org/paste/FJKRwep1fWJogIDbLnkn8A
>
> Hopefully this might help the OpenAFS devs, or someone might know what
> might be borking on every RHEL 7.5 beta host. It does fit with what
> other 7.5 beta users have observed OpenAFS doing.

Yes, now it seems like all our reports are consistent, and we just have to wait for a developer to get a better look at what Red Hat changed in the kernel that we need to adapt to.

-Ben

> Thanks!
> - Kodiak
>
> On Mon, Feb 5, 2018 at 12:31 PM, Stephan Wiesand wrote:
>
> >
> > > On 04.Feb 2018, at 02:11, Jeffrey Altman wrote:
> > >
> > > On 2/2/2018 6:04 PM, Kodiak Firesmith wrote:
> > >> I'm relatively new to handling OpenAFS. Are these problems part
> > >> of a normal "kernel release; openafs update" cycle and perhaps
> > >> I'm getting snagged just by being too early of an adopter?
> > >> I wanted to raise the alarm on this and see if anything else was
> > >> needed from me as the reporter of the issue, but perhaps that's
> > >> an overreaction to what is just part of a normal process I just
> > >> haven't been tuned into in prior RHEL release cycles?
> > >
> > >
> > > Kodiak,
> > >
> > > On RHEL, DKMS is safe to use for kernel modules that restrict
> > > themselves to using the restricted set of kernel interfaces (the
> > > RHEL KABI) that Red Hat has designated will be supported across
> > > the lifespan of the RHEL major version number. OpenAFS is not
> > > such a kernel module. As a result it is vulnerable to breakage
> > > each and every time a new kernel is shipped.
> >
> > Jeffrey,
> >
> > the usual way to use DKMS is to either have it build a module for a
> > newly installed kernel or install a prebuilt module for that kernel.
> > It may be possible to abuse it for providing a module built for
> > another kernel, but I think that won't happen accidentally.
> >
> > You may be confusing DKMS with RHEL's "KABI tracking kmods". Those
> > should be safe to use within a RHEL minor release (and the SL
> > packaging has been using them like this since EL6.4), but aren't
> > across minor releases (and that's why the SL packaging modifies the
> > kmod handling to require a build for the minor release in question).
> >
> > > There are two types of failures that can occur:
> > >
> > > 1. a change results in failure to build the OpenAFS kernel module
> > >    for the new kernel
> > >
> > > 2. a change results in the OpenAFS kernel module building and
> > >    successfully loading but failing to operate correctly
> >
> > The latter shouldn't happen within a minor release, but can across
> > minor releases.
> >
> > > It is the second of these possibilities that has taken place with
> > > the release of the 3.10.0-830.el7 kernel shipped as part of the
> > > RHEL 7.5 beta.
> > >
> > > Are you an early adopter of RHEL 7.5 beta? Absolutely, it's a beta
> > > release and as such you should expect that there will be bugs and
> > > that third party kernel modules that do not adhere to the KABI
> > > functionality might have compatibility issues.
> >
> > The -830 kernel can break 3rd-party modules using non-whitelisted
> > ABIs, whether or not they adhere to the "KABI functionality".
> >
> > > There was a compatibility issue with RHEL 7.4 kernel
> > > (3.10.0_693.1.1.el7) as well that was only fixed in the OpenAFS
> > > 1.6 release series this past week as part of 1.6.22.2:
> > >
> > > http://www.openafs.org/dl/openafs/1.6.22.2/RELNOTES-1.6.22.2
> >
> > Yes, and this one was hard to fix. Thanks are due to Mark Vitale for
> > developing the fix and all those who reviewed and tested it.
> >
> > > Jeffrey Altman
> > > AuriStor, Inc.
> > >
> > > P.S. - Welcome to the community.
> >
> > Seconded. In particular, the problem report regarding the EL7.5beta
> > kernel was absolutely appropriate.
> >
> > --
> > Stephan Wiesand
> > DESY - DV -
> > Platanenallee 6
> > 15738 Zeuthen, Germany
> >
> >
_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info

From stephan.wiesand@desy.de Fri Mar 2 09:14:48 2018
From: stephan.wiesand@desy.de (Stephan Wiesand)
Date: Fri, 2 Mar 2018 10:14:48 +0100
Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
In-Reply-To:
References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu>
Message-ID: <0E9F9FB1-74D2-4FB5-A24E-838E6DF59429@desy.de>

Hello,

> On 2. Mar 2018, at 09:47, Anders Nordin wrote:
>
> Hello,
>
> Is there any progress on this issue?

incidentally, Mark uploaded https://gerrit.openafs.org/12935 a couple of hours ago. It's probably not final since it seems to cause build failures on some older platforms.
But it's certainly worth a try on EL7.5 beta systems. It would also be interesting to know on which other platforms it fails to build (or work).

> Can we expect a stable release for RHEL 7.5?

Once we have a change confirmed to fix the EL7.5 issue and not break other platforms, yes. Whether it will be available quite in time for 7.5 GA is hard to say. You can help...

Best regards,
	Stephan

> Best regards
> Anders
>
> -----Original Message-----
> From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org] On Behalf Of Benjamin Kaduk
> Sent: 9 February 2018 01:02
> To: Kodiak Firesmith
> Cc: openafs-info
> Subject: Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
>
> On Wed, Feb 07, 2018 at 11:46:28AM -0500, Kodiak Firesmith wrote:
>> Hello again All,
>>
>> As part of continued testing, I've been able to confirm that the
>> SystemD double-service startup thing only happens to my hosts when
>> going from RHEL 7.4 to RHEL 7.5beta. On a test host installed
>> directly as RHEL 7.5beta, I get a bit farther with 1.6.18.22, in
>> that I get to the point where OpenAFS "kind of" works.
>
> Thanks for tracking this down. The rpm packaging maintainers may want to try to track down why the double-start happens in the upgrade scenario, as that's pretty nasty behavior.
>
>> What I'm observing is that the openafs client Kernel module (built by
>> DKMS) loads fine, and just so long as you know where you need to go in
>> /afs, you can get there, and you can read and write files and the
>> OpenAFS 'fs' command works. But doing an 'ls' of /afs or any path
>> underneath results in
>> "ls: reading directory /afs/: Not a directory".
>>
>> I ran an strace of a good RHEL 7.4 host running ls on /afs, and a RHEL
>> 7.5beta host running ls on /afs and have created pastebins of both, as
>> well as an inline diff.
>>
>> All can be seen at the following locations:
>>
>> works
>> https://paste.fedoraproject.org/paste/Hiojt2~Be3wgez47bKNucQ
>>
>> fails
>> https://paste.fedoraproject.org/paste/13ZXBfJIOMsuEJFwFShBfg
>>
>> diff
>> https://paste.fedoraproject.org/paste/FJKRwep1fWJogIDbLnkn8A
>>
>> Hopefully this might help the OpenAFS devs, or someone might know what
>> might be borking on every RHEL 7.5 beta host. It does fit with what
>> other 7.5 beta users have observed OpenAFS doing.
>
> Yes, now it seems like all our reports are consistent, and we just have to wait for a developer to get a better look at what Red Hat changed in the kernel that we need to adapt to.
>
> -Ben
>
>> Thanks!
>> - Kodiak
>>
>> On Mon, Feb 5, 2018 at 12:31 PM, Stephan Wiesand wrote:
>>
>>>
>>>> On 04.Feb 2018, at 02:11, Jeffrey Altman wrote:
>>>>
>>>> On 2/2/2018 6:04 PM, Kodiak Firesmith wrote:
>>>>> I'm relatively new to handling OpenAFS. Are these problems part
>>>>> of a normal "kernel release; openafs update" cycle and perhaps
>>>>> I'm getting snagged just by being too early of an adopter? I
>>>>> wanted to raise the alarm on this and see if anything else was
>>>>> needed from me as the reporter of the issue, but perhaps that's
>>>>> an overreaction to what is just part of a normal process I just
>>>>> haven't been tuned into in prior RHEL release cycles?
>>>>
>>>>
>>>> Kodiak,
>>>>
>>>> On RHEL, DKMS is safe to use for kernel modules that restrict
>>>> themselves to using the restricted set of kernel interfaces (the
>>>> RHEL KABI) that Red Hat has designated will be supported across
>>>> the lifespan of the RHEL major version number. OpenAFS is not
>>>> such a kernel module. As a result it is vulnerable to breakage
>>>> each and every time a new kernel is shipped.
>>>
>>> Jeffrey,
>>>
>>> the usual way to use DKMS is to either have it build a module for a
>>> newly installed kernel or install a prebuilt module for that kernel.
>>> It may be possible to abuse it for providing a module built for
>>> another kernel, but I think that won't happen accidentally.
>>>
>>> You may be confusing DKMS with RHEL's "KABI tracking kmods". Those
>>> should be safe to use within a RHEL minor release (and the SL
>>> packaging has been using them like this since EL6.4), but aren't
>>> across minor releases (and that's why the SL packaging modifies the
>>> kmod handling to require a build for the minor release in question).
>>>
>>>> There are two types of failures that can occur:
>>>>
>>>> 1. a change results in failure to build the OpenAFS kernel module
>>>>    for the new kernel
>>>>
>>>> 2. a change results in the OpenAFS kernel module building and
>>>>    successfully loading but failing to operate correctly
>>>
>>> The latter shouldn't happen within a minor release, but can across
>>> minor releases.
>>>
>>>> It is the second of these possibilities that has taken place with
>>>> the release of the 3.10.0-830.el7 kernel shipped as part of the
>>>> RHEL 7.5 beta.
>>>>
>>>> Are you an early adopter of RHEL 7.5 beta? Absolutely, it's a beta
>>>> release and as such you should expect that there will be bugs and
>>>> that third party kernel modules that do not adhere to the KABI
>>>> functionality might have compatibility issues.
>>>
>>> The -830 kernel can break 3rd-party modules using non-whitelisted
>>> ABIs, whether or not they adhere to the "KABI functionality".
>>>
>>>> There was a compatibility issue with RHEL 7.4 kernel
>>>> (3.10.0_693.1.1.el7) as well that was only fixed in the OpenAFS
>>>> 1.6 release series this past week as part of 1.6.22.2:
>>>>
>>>> http://www.openafs.org/dl/openafs/1.6.22.2/RELNOTES-1.6.22.2
>>>
>>> Yes, and this one was hard to fix. Thanks are due to Mark Vitale for
>>> developing the fix and all those who reviewed and tested it.
>>>
>>>> Jeffrey Altman
>>>> AuriStor, Inc.
>>>>
>>>> P.S. - Welcome to the community.
>>>
>>> Seconded. In particular, the problem report regarding the EL7.5beta
>>> kernel was absolutely appropriate.

-- 
Stephan Wiesand
DESY -DV-
Platanenallee 6
15738 Zeuthen, Germany

From gsgatlin@ncsu.edu Fri Mar 2 11:40:52 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Fri, 2 Mar 2018 06:40:52 -0500
Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
In-Reply-To: <0E9F9FB1-74D2-4FB5-A24E-838E6DF59429@desy.de>
References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu> <0E9F9FB1-74D2-4FB5-A24E-838E6DF59429@desy.de>
Message-ID:

--f403045ea33cdc0e2505666c72a8
Content-Type: text/plain; charset="UTF-8"

On Fri, Mar 2, 2018 at 4:14 AM, Stephan Wiesand wrote:

> Once we have a change confirmed to fix the EL7.5 issue and not break other
> platforms, yes. Whether it will be available quite in time for 7.5 GA is
> hard to say. You can help...

I will test this patch out later today and let you guys know what I find out. Thanks a lot.

--f403045ea33cdc0e2505666c72a8
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--f403045ea33cdc0e2505666c72a8--

From stephan.wiesand@desy.de Fri Mar 2 16:05:02 2018
From: stephan.wiesand@desy.de (Stephan Wiesand)
Date: Fri, 2 Mar 2018 17:05:02 +0100
Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
In-Reply-To:
References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu> <0E9F9FB1-74D2-4FB5-A24E-838E6DF59429@desy.de>
Message-ID:

> On 02.Mar 2018, at 12:40, Gary Gatling wrote:
>
>> On Fri, Mar 2, 2018 at 4:14 AM, Stephan Wiesand wrote:
>>
>>
>> Once we have a change confirmed to fix the EL7.5 issue and not break other platforms, yes. Whether it will be available quite in time for 7.5 GA is hard to say. You can help...
>>
>>
>> I will test this patch out later today and let you guys know what I find out. Thanks a lot.

Make sure you grab the patch from set 3 (the latest revision). It might be the final solution.

--
Stephan Wiesand
DESY - DV -
Platanenallee 6
15738 Zeuthen, Germany

From gsgatlin@ncsu.edu Fri Mar 2 22:11:19 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Fri, 2 Mar 2018 17:11:19 -0500
Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
In-Reply-To:
References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu> <0E9F9FB1-74D2-4FB5-A24E-838E6DF59429@desy.de>
Message-ID:

--001a1142c43882522e05667541cf
Content-Type: text/plain; charset="UTF-8"

I tried to copy/paste the patch at:

http://git.openafs.org/?p=openafs.git;a=blobdiff;f=src/afs/LINUX/osi_vnodeops.c;h=969a27b271ed3b809f1ddaa462099a5cc09d7886;hp=c1acca962337dff1cf66916c1e3e876bd8468e54;hb=a72dafafddaaa5bfe86c067a605aeffa16572c51;hpb=6d74e3d6a1becf86cec30efc2d01a5692167afe1

But it failed for me with openafs-1.6.22.2-src.tar.bz2

patching file src/afs/LINUX/osi_vnodeops.c
Hunk #1 FAILED at 53.
Hunk #2 succeeded at 296 (offset -6 lines).
Hunk #3 succeeded at 378 (offset -7 lines).
Hunk #4 FAILED at 455.
Hunk #5 FAILED at 475.
Hunk #6 FAILED at 798.
4 out of 6 hunks FAILED -- saving rejects to file src/afs/LINUX/osi_vnodeops.c.rej

So I made my own patch based on that one.

https://pastebin.com/NZsUz9Jg

In RHEL 7.5 beta edition vm on kernel 3.10.0-830.el7.x86_64 openafs Works like a CHAMP. :) Can list directories again. Can also edit files in afs. Whew.

Did further testing to be paranoid. Testing listing directories and editing a file in afs path.

Patch was applied across all distros below...

centos 6 32 bit kernel 2.6.32-696.20.1.el6.i686: works
centos 6 64 bit kernel 2.6.32-696.20.1.el6.x86_64: works
centos 7.4 64 bit kernel 3.10.0-693.17.1.el7.x86_64: works
fedora 26 64 bit kernel 4.15.6-200.fc26.x86_64: works
fedora 27 64 bit kernel 4.15.6-300.fc27.x86_64: works

Since all tests succeeded I went ahead and committed and pushed to github.com for my packages.

https://github.com/gsgatlin/openafs-rpms/commit/fd61c9ff2c21404fba5276d7f3919ef1e6ab545d

Thank you very much!

On Fri, Mar 2, 2018 at 11:05 AM, Stephan Wiesand wrote:

>
> > On 02.Mar 2018, at 12:40, Gary Gatling wrote:
> >
> >> On Fri, Mar 2, 2018 at 4:14 AM, Stephan Wiesand <stephan.wiesand@desy.de> wrote:
> >>
> >>
> >> Once we have a change confirmed to fix the EL7.5 issue and not break other platforms, yes. Whether it will be available quite in time for 7.5 GA is hard to say. You can help...
> >>
> >>
> >> I will test this patch out later today and let you guys know what I find out. Thanks a lot.
>
> Make sure you grab the patch from set 3 (the latest revision). It might be the final solution.
>
> --
> Stephan Wiesand
> DESY - DV -
> Platanenallee 6
> 15738 Zeuthen, Germany
>
>
>

--001a1142c43882522e05667541cf
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--001a1142c43882522e05667541cf--

From mvanderw@nd.edu Sat Mar 3 14:50:15 2018
From: mvanderw@nd.edu (Matt Vander Werf)
Date: Sat, 3 Mar 2018 09:50:15 -0500
Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up
In-Reply-To:
References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu> <0E9F9FB1-74D2-4FB5-A24E-838E6DF59429@desy.de>
Message-ID:

--94eb2c1c16be665219056683389e
Content-Type: text/plain; charset="UTF-8"

Yes, that patch was added to the master branch. They usually have to backport patches into the 1.6.x branch before they will work in that codebase as well.

But...I was able to apply that patch from Gerrit to the 1.8.0pre5 release and build RPMs off of that. From my testing, that fix appears to work great for 1.8.x on RHEL 7.5 beta! I am able to ls in any /afs directory successfully now!

Using Gary's (unofficial) 1.6.x patch, I was also able to replicate Gary's success on RHEL 7.5 beta when applying the patch to the latest 1.6.x release! If the "official" 1.6.x backport fix differs from Gary's (for whatever reason...not saying it will), I'd be happy to test out that backported patch as well.

Thanks for all your great work! Looking forward to a new 1.8.x and 1.6.x release with this fix in place!

--
Matt Vander Werf
HPC System Administrator
University of Notre Dame
Center for Research Computing - Union Station
506 W. South Street
South Bend, IN 46601
Phone: (574) 631-0692

On Fri, Mar 2, 2018 at 5:11 PM, Gary Gatling wrote:

> I tried to copy/paste the patch at:
>
>
> http://git.openafs.org/?p=openafs.git;a=blobdiff;f=src/afs/LINUX/osi_vnodeops.c;h=969a27b271ed3b809f1ddaa462099a5cc09d7886;hp=c1acca962337dff1cf66916c1e3e876bd8468e54;hb=a72dafafddaaa5bfe86c067a605aeffa16572c51;hpb=6d74e3d6a1becf86cec30efc2d01a5692167afe1
>
> But it failed for me with openafs-1.6.22.2-src.tar.bz2
>
> patching file src/afs/LINUX/osi_vnodeops.c
> Hunk #1 FAILED at 53.
> Hunk #2 succeeded at 296 (offset -6 lines).
> Hunk #3 succeeded at 378 (offset -7 lines). > Hunk #4 FAILED at 455. > Hunk #5 FAILED at 475. > Hunk #6 FAILED at 798. > 4 out of 6 hunks FAILED -- saving rejects to file > src/afs/LINUX/osi_vnodeops.c.rej > > > So I made my own patch based on that one. > > https://pastebin.com/NZsUz9Jg > > In RHEL 7.5 beta edition vm on kernel 3.10.0-830.el7.x86_64 openafs Works > like a CHAMP. :) Can list directories again. Can also edit files in afs. > Whew. > > Did further testing to be paranoid. Testing listing directories and > editing a file in afs path. > > Patch was applied across all distros below... > > centos 6 32 bit kernel 2.6.32-696.20.1.el6.i686: works > centos 6 64 bit kernel 2.6.32-696.20.1.el6.x86_64: works > centos 7.4 64 bit kernel 3.10.0-693.17.1.el7.x86_64: works > fedora 26 64 bit kernel 4.15.6-200.fc26.x86_64: works > fedora 27 64 bit kernel 4.15.6-300.fc27.x86_64: works > > Since all tests succeeded I went ahead and committed and pushed to > github.com for my packages. > > https://github.com/gsgatlin/openafs-rpms/commit/fd61c9ff2c21 > 404fba5276d7f3919ef1e6ab545d > > Thank you very much! > > > > > > On Fri, Mar 2, 2018 at 11:05 AM, Stephan Wiesand > wrote: > >> >> > On 02.Mar 2018, at 12:40, Gary Gatling wrote: >> > >> >> On Fri, Mar 2, 2018 at 4:14 AM, Stephan Wiesand < >> stephan.wiesand@desy.de> wrote: >> >> >> >> >> >> Once we have a change confirmed to fix the EL7.5 issue and not break >> other platforms, yes. Whether it will be available quite in time for 7.5 GA >> is hard to say. You can help... >> >> >> >> >> >> I will test this patch out later today and let you guys know what I >> find out. Thanks a lot. >> >> Make sure you grab the patch from set 3 (the latest revision). It might >> be the final solution. >> >> -- >> Stephan Wiesand >> DESY - DV - >> Platanenallee 6 >> >> 15738 Zeuthen, Germany >> >> >> > --94eb2c1c16be665219056683389e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--94eb2c1c16be665219056683389e--

From gsgatlin@ncsu.edu Sat Mar 3 15:15:10 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Sat, 3 Mar 2018 10:15:10 -0500
Subject: [OpenAFS] question about authentication with kerberos and Default principal
Message-ID:

--001a1142c4381d75a90566838f67
Content-Type: text/plain; charset="UTF-8"

Recently I decided to play around with some alternative architectures on fedora with virt-manager/qemu. So I set up some power machines (ppc64 and ppc64le).
I also made some arm machines but I gather openafs isn't quite ready yet for arm in 1.6.22.2.

I was able to compile openafs rpms for ppc64. I did not try ppc64le yet. It's very slow to build in an emulator. Surprisingly openafs builds and I am able to start the service and list afs directories. But I can't authenticate to kerberos like I can on x86_64. I feel like I'm missing something basic but I'm unsure what it is.

When I run kinit on x86_64, I get

[gsgatlin@t540p ~]$ kinit gsgatlin
Password for gsgatlin@EOS.NCSU.EDU:
[gsgatlin@t540p ~]$ klist
Ticket cache: KCM:1000
Default principal: gsgatlin@EOS.NCSU.EDU

Valid starting       Expires              Service principal
03/03/2018 09:55:22  03/04/2018 07:10:22  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU
        renew until 03/10/2018 09:55:17

but on ppc64 emulator, I get

[gsgatlin@localhost bin]$ kinit gsgatlin
Password for gsgatlin@EOS.NCSU.EDU:
[gsgatlin@localhost bin]$ klist
Ticket cache: KCM:1000:53854
Default principal: @EOS.NCSU.EDU

Valid starting       Expires              Service principal
03/03/2018 09:56:23  03/04/2018 07:11:23  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU
        for client gsgatlin@EOS.NCSU.EDU, renew until 03/10/2018 09:56:17

Notice the default principal says @EOS.NCSU.EDU instead of gsgatlin@EOS.NCSU.EDU like it did on x86_64.

So when I run aklog on ppc64 it fails

[gsgatlin@localhost bin]$ aklog -d -c eos.ncsu.edu -k EOS.NCSU.EDU
Authenticating to cell eos.ncsu.edu (server eos01db.unity.ncsu.edu).
We were told to authenticate to realm EOS.NCSU.EDU.
Getting tickets: afs/eos.ncsu.edu@EOS.NCSU.EDU
Kerberos error code returned by get_cred : -1765328243
aklog: Couldn't get eos.ncsu.edu AFS tickets:
aklog: unknown RPC error (-1765328243) while getting AFS tickets

but on x86_64 (either on virt-manager or a real pc) I get

[gsgatlin@t540p ~]$ aklog -d -c eos.ncsu.edu -k EOS.NCSU.EDU
Authenticating to cell eos.ncsu.edu (server eos01db.unity.ncsu.edu).
We were told to authenticate to realm EOS.NCSU.EDU.
Getting tickets: afs/eos.ncsu.edu@EOS.NCSU.EDU
Using Kerberos V5 ticket natively
About to resolve name gsgatlin to id in cell eos.ncsu.edu.
Id 19149
Set username to AFS ID 19149
Setting tokens. AFS ID 19149 @ eos.ncsu.edu

Here is a link to my /etc/krb5.conf file on both systems:

https://pastebin.com/3HHP15c0

Does anyone know why it would work on one architecture (x86_64) but fail on another (ppc64) ? Is my /etc/krb5.conf missing something? kinit is provided by red hat so I think I can't have messed up that particular binary.

[gsgatlin@localhost bin]$ which kinit
/usr/bin/kinit
[gsgatlin@localhost bin]$ rpm -qf /usr/bin/kinit
krb5-workstation-1.15.2-7.fc27.ppc64

Thanks a lot for any ideas anyone may have. I feel like I was close to getting everything working.

--001a1142c4381d75a90566838f67
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--001a1142c4381d75a90566838f67-- From deengert@gmail.com Sat Mar 3 15:42:50 2018 From: deengert@gmail.com (Douglas E Engert) Date: Sat, 3 Mar 2018 09:42:50 -0600 Subject: [OpenAFS] question about authentication with kerberos and Default principal In-Reply-To: References: Message-ID: Looks like the hostname is "localhost" on the ppc64. Did you miss a step? On 3/3/2018 9:15 AM, Gary Gatling wrote: > Recently I decided to play around with some alternative architectures on fedora with virt-manager/qemu. So I set up some power machines. (ppc64 and ppc64le) > I also made some arm machines but I gather openafs isn't quite ready yet for arm in 1.6.22.2. > > I was able to compile openafs rpms for ppc64.  I did not try ppc64le yet. Its very slow to build in a emulator. Surprisingly openafs builds and I am able to start the service and list afs directories. > But I can't authenticate to kerberos like I can on x86_64. I feel like I'm missing something basic but I'm unsure what it is. > > When I run kinit on x86_64, I get > > [gsgatlin@t540p ~]$ kinit gsgatlin > Password for gsgatlin@EOS.NCSU.EDU : > [gsgatlin@t540p ~]$ klist > Ticket cache: KCM:1000 > Default principal: gsgatlin@EOS.NCSU.EDU > > Valid starting       Expires              Service principal > 03/03/2018 09:55:22  03/04/2018 07:10:22  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU > renew until 03/10/2018 09:55:17 > > > but on ppc64 emulator, I get > > [gsgatlin@localhost bin]$ kinit gsgatlin > Password for gsgatlin@EOS.NCSU.EDU : > [gsgatlin@localhost bin]$ klist > Ticket cache: KCM:1000:53854 > Default principal: @EOS.NCSU.EDU > > Valid starting       Expires              Service principal > 03/03/2018 09:56:23  03/04/2018 07:11:23  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU > for client gsgatlin@EOS.NCSU.EDU , renew until 03/10/2018 09:56:17 > > Notice the default principal says  @EOS.NCSU.EDU instead of gsgatlin@EOS.NCSU.EDU like it did on x86_64. 
> > So when I run aklog on ppc64 it fails > > [gsgatlin@localhost bin]$ aklog -d -c eos.ncsu.edu -k EOS.NCSU.EDU > Authenticating to cell eos.ncsu.edu (server eos01db.unity.ncsu.edu ). > We were told to authenticate to realm EOS.NCSU.EDU . > Getting tickets: afs/eos.ncsu.edu@EOS.NCSU.EDU > Kerberos error code returned by get_cred : -1765328243 > aklog: Couldn't get eos.ncsu.edu AFS tickets: > aklog: unknown RPC error (-1765328243) while getting AFS tickets > > > but on x86_64 (either on virt-manager or a real pc) I get > > [gsgatlin@t540p ~]$ aklog -d -c eos.ncsu.edu -k EOS.NCSU.EDU > Authenticating to cell eos.ncsu.edu (server eos01db.unity.ncsu.edu ). > We were told to authenticate to realm EOS.NCSU.EDU . > Getting tickets: afs/eos.ncsu.edu@EOS.NCSU.EDU > Using Kerberos V5 ticket natively > About to resolve name gsgatlin to id in cell eos.ncsu.edu . > Id 19149 > Set username to AFS ID 19149 > Setting tokens. AFS ID 19149 @ eos.ncsu.edu > > Here is a link to my /etc/krb5.conf file on both systems: > > https://pastebin.com/3HHP15c0 > > Does anyone know why it would work on one architecture (x86_64) but fail on another (ppc64) ? Is my /etc/krb5.conf missing something? kinit is provided by red hat so I think I can't have messed up > that particular binary. > > [gsgatlin@localhost bin]$ which kinit > /usr/bin/kinit > [gsgatlin@localhost bin]$ rpm -qf /usr/bin/kinit > krb5-workstation-1.15.2-7.fc27.ppc64 > > Thanks a lot for any ideas anyone may have.  I feel like I was close to getting everything working. > > -- Douglas E. Engert From gsgatlin@ncsu.edu Sat Mar 3 16:17:04 2018 From: gsgatlin@ncsu.edu (Gary Gatling) Date: Sat, 3 Mar 2018 11:17:04 -0500 Subject: [OpenAFS] question about authentication with kerberos and Default principal In-Reply-To: References: Message-ID: --94eb2c070e7275fac20566846cd3 Content-Type: text/plain; charset="UTF-8" On Sat, Mar 3, 2018 at 10:42 AM, Douglas E Engert wrote: > Looks like the hostname is "localhost" on the ppc64. 
> Did you miss a step?

I tried it in another VM that is x86_64 with the same krb5.conf. The first time I was using the "parent OS", which I had set the hostname on. Sorry about that...

[gsgatlin@localhost ~]$ kinit gsgatlin
Password for gsgatlin@EOS.NCSU.EDU:
[gsgatlin@localhost ~]$ klist
Ticket cache: KCM:1000
Default principal: gsgatlin@EOS.NCSU.EDU

Valid starting       Expires              Service principal
03/03/2018 11:09:59  03/04/2018 08:24:59  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU
        renew until 03/10/2018 11:09:52
[gsgatlin@localhost ~]$ aklog -d -c eos.ncsu.edu -k EOS.NCSU.EDU
Authenticating to cell eos.ncsu.edu (server eos01db.unity.ncsu.edu).
We were told to authenticate to realm EOS.NCSU.EDU.
Getting tickets: afs/eos.ncsu.edu@EOS.NCSU.EDU
Using Kerberos V5 ticket natively
About to resolve name gsgatlin to id in cell eos.ncsu.edu.
Id 19149
Set username to AFS ID 19149
Setting tokens. AFS ID 19149 @ eos.ncsu.edu
[gsgatlin@localhost ~]$ hostname
localhost.localdomain
[gsgatlin@localhost ~]$ uname -a
Linux localhost.localdomain 4.15.6-300.fc27.x86_64 #1 SMP Mon Feb 26 18:43:03 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[gsgatlin@localhost ~]$

Same commands on the VM with ppc64:

[gsgatlin@localhost ~]$ kinit gsgatlin
Password for gsgatlin@EOS.NCSU.EDU:
[gsgatlin@localhost ~]$ klist
Ticket cache: KCM:1000:64581
Default principal: @EOS.NCSU.EDU

Valid starting       Expires              Service principal
03/03/2018 11:14:07  03/04/2018 08:29:07  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU
        for client gsgatlin@EOS.NCSU.EDU, renew until 03/10/2018 11:14:00
[gsgatlin@localhost ~]$ aklog -d -c eos.ncsu.edu -k EOS.NCSU.EDU
Authenticating to cell eos.ncsu.edu (server eos01db.unity.ncsu.edu).
We were told to authenticate to realm EOS.NCSU.EDU.
Getting tickets: afs/eos.ncsu.edu@EOS.NCSU.EDU
Kerberos error code returned by get_cred : -1765328243
aklog: Couldn't get eos.ncsu.edu AFS tickets:
aklog: unknown RPC error (-1765328243) while getting AFS tickets
[gsgatlin@localhost ~]$ hostname
localhost.localdomain
[gsgatlin@localhost ~]$ uname -a
Linux localhost.localdomain 4.15.6-300.fc27.ppc64 #1 SMP Mon Feb 26 18:18:35 UTC 2018 ppc64 ppc64 ppc64 GNU/Linux

I noticed that the Ticket cache: was different as well.

Also, unrelated but it looks like I can't test ppc64le. I get this when I try to compile it.

make[1]: *** No rule to make target 'param.ppc64le_linux26.h', needed by 'param.h.new'.  Stop.

Oh well. :(

--94eb2c070e7275fac20566846cd3
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--94eb2c070e7275fac20566846cd3--

From haba@kth.se Sat Mar 3 17:14:50 2018
From: haba@kth.se (Harald Barth)
Date: Sat, 03 Mar 2018 18:14:50 +0100 (CET)
Subject: [OpenAFS] question about authentication with kerberos and Default principal
In-Reply-To:
References:
Message-ID: <20180303.181450.45876910850177078.haba@habook.pdc.kth.se>

Hm. If I remember correctly, at least parts of the kerberos ticket in
the ticket cache are endian dependent. As the principal name seems to
be broken to start with, maybe the error is there. Do you have the
same problems if you use the FILE: ticket cache type or the kinit and
afslog from heimdal to handle tickets and tokens?

Harald.

From gsgatlin@ncsu.edu Sat Mar 3 20:20:39 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Sat, 3 Mar 2018 15:20:39 -0500
Subject: [OpenAFS] question about authentication with kerberos and Default principal
In-Reply-To: <20180303.181450.45876910850177078.haba@habook.pdc.kth.se>
References: <20180303.181450.45876910850177078.haba@habook.pdc.kth.se>
Message-ID:

--001a113fbdc8969f1f056687d3a4
Content-Type: text/plain; charset="UTF-8"

On Sat, Mar 3, 2018 at 12:14 PM, Harald Barth wrote:

>
> Hm. If I remember correctly, at least parts of the kerberos ticket in
> the ticket cache are endian dependent. As the principal name seems to
> be broken to start with, maybe the error is there. Do you have the
> same problems if you use the FILE: ticket cache type or the kinit and
> afslog from heimdal to handle tickets and tokens?
>
>

Does heimdal-klist use /etc/krb5.conf or does it use some other configuration file? I'm worried I did not set up a config file.

[gsgatlin@localhost ~]$ /usr/bin/heimdal-kinit gsgatlin
gsgatlin@LOCALDOMAIN's Password:
heimdal-kinit: krb5_get_init_creds: unable to reach any KDC in realm LOCALDOMAIN

Also, going back to the krb5 kinit, how can you specify a FILE: ticket cache type ?

Sorry for stupid questions. I have never used heimdal before. Thanks a lot.
--001a113fbdc8969f1f056687d3a4 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

--001a113fbdc8969f1f056687d3a4--

From haba@kth.se Sat Mar 3 20:46:59 2018
From: haba@kth.se (Harald Barth)
Date: Sat, 03 Mar 2018 21:46:59 +0100 (CET)
Subject: [OpenAFS] question about authentication with kerberos and Default principal
In-Reply-To:
References: <20180303.181450.45876910850177078.haba@habook.pdc.kth.se>
Message-ID: <20180303.214659.2017581479824667213.haba@habook.pdc.kth.se>

> Does heimdal-klist use /etc/krb5.conf or does it use some other
> configuration file? I'm worried I did not set up a config file.

It should use /etc/krb5.conf as well unless KRB5_CONFIG is set. You
should have something like:

[libdefaults]
        default_realm = YOURDOMAIN

in there.

> [gsgatlin@localhost ~]$ /usr/bin/heimdal-kinit gsgatlin

or use

/usr/bin/heimdal-kinit gsgatlin@YOURDOMAIN

> Also, going back to the krb5 kinit, how can you specify a FILE: ticket
> cache type ?

Both MIT kinit and heimdal kinit honor the KRB5CCNAME environment
variable which has the form TYPE:location thus a typical way to set
your FILE cache is:

export KRB5CCNAME=FILE:/tmp/krb5cc_`id -u`

Btw: As FILE: is the oldest ticket cache type and the default, any
file name will do. For example:

export KRB5CCNAME=/tmp/whatever

will set it to /tmp/whatever

Greetings,
Harald.
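
[Editorial sketch, not part of the original message.] The KRB5CCNAME advice above could be wrapped in a small script along the following lines. This is only a sketch under stated assumptions: it assumes kinit, klist and aklog are on the PATH, the helper name file_ccache_name and the variables RUN_KINIT, PRINCIPAL, CELL and REALM are hypothetical names introduced here, and the aklog flags are the ones already used earlier in this thread.

```shell
#!/bin/sh
# Sketch: force a FILE: credential cache before obtaining tickets and tokens.
# Assumptions (not from the thread): RUN_KINIT, PRINCIPAL, CELL and REALM are
# hypothetical variables; file_ccache_name is a hypothetical helper.

# Build a per-user cache name of the form TYPE:location.
file_ccache_name() {
    printf 'FILE:/tmp/krb5cc_%s\n' "$(id -u)"
}

KRB5CCNAME="$(file_ccache_name)"
export KRB5CCNAME
echo "Using ticket cache: $KRB5CCNAME"

# Only contact the KDC when explicitly requested, so the script can be
# sourced or dry-run without prompting for a password.
if [ "${RUN_KINIT:-0}" = "1" ]; then
    kinit "${PRINCIPAL:?set PRINCIPAL}" &&
    klist &&
    aklog -c "${CELL:?set CELL}" -k "${REALM:?set REALM}"
fi
```

For example, `RUN_KINIT=1 PRINCIPAL=gsgatlin CELL=eos.ncsu.edu REALM=EOS.NCSU.EDU sh ./script.sh` would mirror the manual steps discussed in this thread; without RUN_KINIT it only sets and prints the cache name.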
From gsgatlin@ncsu.edu Sat Mar 3 21:52:56 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Sat, 3 Mar 2018 16:52:56 -0500
Subject: [OpenAFS] question about authentication with kerberos and Default principal
In-Reply-To: <20180303.214659.2017581479824667213.haba@habook.pdc.kth.se>
References: <20180303.181450.45876910850177078.haba@habook.pdc.kth.se> <20180303.214659.2017581479824667213.haba@habook.pdc.kth.se>
Message-ID:

--f403045ea6869bd7960566891d35
Content-Type: text/plain; charset="UTF-8"

On Sat, Mar 3, 2018 at 3:46 PM, Harald Barth wrote:

> Both MIT kinit and heimdal kinit honor the KRB5CCNAME environment
> variable which has the form TYPE:location thus a typical way to set
> your FILE cache is:
>
> export KRB5CCNAME=FILE:/tmp/krb5cc_`id -u`
>
> Btw: As FILE: is the oldest ticket cache type and the default, any
> file name will do. For example:
>
> export KRB5CCNAME=/tmp/whatever
>
> will set it to /tmp/whatever

Huh. That's pretty weird. Using the KRB5CCNAME it works fine.

[gsgatlin@localhost ~]$ export KRB5CCNAME=FILE:/tmp/krb5cc_`id -u`
[gsgatlin@localhost ~]$ kinit gsgatlin
Password for gsgatlin@EOS.NCSU.EDU:
[gsgatlin@localhost ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: gsgatlin@EOS.NCSU.EDU

Valid starting       Expires              Service principal
03/03/2018 16:40:27  03/04/2018 13:55:27  krbtgt/EOS.NCSU.EDU@EOS.NCSU.EDU
        renew until 03/10/2018 16:40:22
[gsgatlin@localhost ~]$ aklog -c eos.ncsu.edu -k EOS.NCSU.EDU
[gsgatlin@localhost ~]$ aklog -c unity.ncsu.edu -k EOS.NCSU.EDU
[gsgatlin@localhost ~]$ aklog -c bp.ncsu.edu -k EOS.NCSU.EDU
[gsgatlin@localhost ~]$ tokens

Tokens held by the Cache Manager:

User's (AFS ID 19149) tokens for afs@bp.ncsu.edu [Expires Mar  4 13:55]
User's (AFS ID 19149) tokens for afs@unity.ncsu.edu [Expires Mar  4 13:55]
User's (AFS ID 19149) tokens for afs@eos.ncsu.edu [Expires Mar  4 13:55]
   --End of list--

I couldn't get heimdal-kinit to work right, but setting KRB5CCNAME=FILE:/tmp/krb5cc_`id -u` is fine. I can just add that to my auth shell script that does the kinit and the aklog stuff on ppc64. It's weird that it didn't work without it, but I'm happy it works at all. Thank you! :)

I can edit files in my home directory so I know it's working now.

Thank you again for your help.

--f403045ea6869bd7960566891d35
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

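The auth shell script mentioned in this message is not shown in the thread. A minimal sketch of what such a wrapper might look like follows, assuming the realm and cells from the transcript. The function names set_file_ccache and aklog_commands are hypothetical, and the aklog command lines are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical sketch of an auth wrapper; not the script from the thread.

# Force a FILE: ticket cache, as suggested earlier in the thread.
set_file_ccache() {
    KRB5CCNAME="FILE:/tmp/krb5cc_$(id -u)"
    export KRB5CCNAME
}

# Print one aklog command line per cell, using the -c (cell) and
# -k (realm) options shown in the transcript.
aklog_commands() {
    realm=$1
    shift
    for cell in "$@"; do
        printf 'aklog -c %s -k %s\n' "$cell" "$realm"
    done
}

set_file_ccache
aklog_commands EOS.NCSU.EDU eos.ncsu.edu unity.ncsu.edu bp.ncsu.edu
```

In a real script one would presumably run kinit after setting the cache and then execute each printed line, for example by piping the output to sh.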
--f403045ea6869bd7960566891d35--

From gsgatlin@ncsu.edu Sun Mar 4 09:31:10 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Sun, 4 Mar 2018 04:31:10 -0500
Subject: [OpenAFS] building openafs on ppc64le architecture on Linux
Message-ID:

--f403045ea33caf5630056692dea2
Content-Type: text/plain; charset="UTF-8"

Hello.

I was trying to compile openafs under ppc64le on fedora 27 in a vm I made. The first problem I ran into was that config.guess did not know about ppc64le. So I replaced that one with a newer version of it from

http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.guess;hb=HEAD

install -p -m 755 %{SOURCE23} ${RPM_BUILD_DIR}/%{name}-%{version}/build-tools/config.guess

This got me further.

The second problem was that there was no param.ppc64le_linux26.h file. But I found this thread from 2016 and followed the advice given:

https://lists.openafs.org/pipermail/openafs-info/2016-April/041757.html

Here are the commands I was using:

cp ${RPM_BUILD_DIR}/%{name}-%{version}/src/config/param.ppc64_linux26.h ${RPM_BUILD_DIR}/%{name}-%{version}/src/config/param.ppc64le_linux26.h

sed -i 's@AFSBIG_ENDIAN@AFSLITTLE_ENDIAN@' ${RPM_BUILD_DIR}/%{name}-%{version}/src/config/param.ppc64le_linux26.h

That did change the value, and the build got a bit farther:

cat /home/gsgatlin/redhat/BUILD/openafs-1.6.22.2/src/config/param.ppc64le_linux26.h | grep AFSLITTLE_ENDIAN
#define AFSLITTLE_ENDIAN 1

But now the third problem / error I get is:

rx_pthread.c:164:97: error: expected expression before ';' token
  error = CV_TIMEDWAIT(&rx_event_handler_cond, &event_handler_mutex, &rx_pthread_next_event_time);

Here is the full output of warnings and errors, starting a little bit above where it goes south as far as I can tell:

https://pastebin.com/j9Nm58sy

Here are my config options:

# build the user-space bits for base architectures
    ./configure \
        --prefix=%{_prefix} \
        --libdir=%{_libdir} \
        --bindir=%{_bindir} \
        --sbindir=%{_sbindir} \
        --sysconfdir=%{_sysconfdir} \
        --localstatedir=%{_var} \
        --with-afs-sysname=%{sysname} \
        --with-linux-kernel-headers=%{ksource_dir} \
        --disable-kernel-module \
        --disable-strip-binaries \
        --enable-supergroups

where %{sysname} is ppc64le_linux26

Anyone know if there could be a way around that error? Perhaps it's just too hard a problem? I could make a longer pastebin with all the output from rpmbuild if it would help.

Thanks,

--f403045ea33caf5630056692dea2
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
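The cp and sed steps in this message can be wrapped in a small helper. The sketch below is only an illustration: the function name make_le_param is hypothetical, it writes a new file rather than editing in place, and the thread does not establish that swapping the endianness macro is sufficient for a working ppc64le port.

```shell
#!/bin/sh
# Hypothetical helper: derive a little-endian param header from a
# big-endian one by swapping the endianness macro name, mirroring the
# cp + sed steps quoted in the message.
make_le_param() {
    src=$1
    dst=$2
    # Write to the new file instead of editing in place, so the
    # original big-endian header is left untouched.
    sed 's@AFSBIG_ENDIAN@AFSLITTLE_ENDIAN@' "$src" > "$dst"
}

# Small demonstration on a throwaway file (not a real OpenAFS header).
tmp=$(mktemp -d)
printf '#define AFSBIG_ENDIAN 1\n' > "$tmp/param.ppc64_linux26.h"
make_le_param "$tmp/param.ppc64_linux26.h" "$tmp/param.ppc64le_linux26.h"
grep AFSLITTLE_ENDIAN "$tmp/param.ppc64le_linux26.h"
rm -rf "$tmp"
```

Keeping the source header unmodified makes it easy to diff the two files later if further ppc64le-specific changes turn out to be needed.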
--f403045ea33caf5630056692dea2--

From kaduk@mit.edu Sun Mar 4 19:57:43 2018
From: kaduk@mit.edu (Benjamin Kaduk)
Date: Sun, 4 Mar 2018 13:57:43 -0600
Subject: [OpenAFS] question about authentication with kerberos and Default principal
In-Reply-To:
References:
Message-ID: <20180304195742.GJ50954@kduck.kaduk.org>

On Sat, Mar 03, 2018 at 10:15:10AM -0500, Gary Gatling wrote:
> Recently I decided to play around with some alternative architectures on
> fedora with virt-manager/qemu. So I set up some power machines. (ppc64 and
> ppc64le)
> I also made some arm machines but I gather openafs isn't quite ready yet
> for arm in 1.6.22.2.

Just addressing this one fork of the thread...

For Aarch64/arm64, that's correct. The older arm architectures should
work, though -- Debian has armel and armhf packages available even in
stretch, which has not taken the 1.8.x branch.

-Ben

From kaduk@mit.edu Mon Mar 5 01:05:40 2018
From: kaduk@mit.edu (Benjamin Kaduk)
Date: Sun, 4 Mar 2018 19:05:40 -0600
Subject: [OpenAFS] building openafs on ppc64le architecture on Linux
In-Reply-To:
References:
Message-ID: <20180305010539.GK50954@kduck.kaduk.org>

On Sun, Mar 04, 2018 at 04:31:10AM -0500, Gary Gatling wrote:
> Hello.
>
> But now the third problem / error I get is:
>
> rx_pthread.c:164:97: error: expected expression before ';' token
> error = CV_TIMEDWAIT(&rx_event_handler_cond, &event_handler_mutex,
> &rx_pthread_next_event_time);

Hmm, it is as if CV_TIMEDWAIT() somehow got #defined away.

I see from the pastebin that you are basing your work off 1.6.22; I would
recommend starting again from master (or 1.8.0pre5 which is pretty similar),
since (1) new code would have to go through master anyway, and (2) master
has some changes in this area, using the OpenAFS Portable Runtime (opr)
library instead of directly using pthread calls, which may or may not be
relevant.
-Ben

From dirk.heinrichs@altum.de Thu Mar 8 16:39:51 2018
From: dirk.heinrichs@altum.de (Dirk Heinrichs)
Date: Thu, 8 Mar 2018 17:39:51 +0100
Subject: [OpenAFS] Linux: systemctl --user vs. AFS
Message-ID: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--zPQKnHeQKMO0kUtHzdojHAm6Rjiyv0KNg
Content-Type: multipart/mixed; boundary="O6CNZIwhucmZMv8oLPpMRFIXMCAJPtygz"; protected-headers="v1"
From: Dirk Heinrichs
To: openafs-info
Message-ID: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
Subject: Linux: systemctl --user vs. AFS

--O6CNZIwhucmZMv8oLPpMRFIXMCAJPtygz
Content-Type: multipart/alternative; boundary="------------BA50F1628C25969E69484B81"
Content-Language: de-DE

This is a multi-part message in MIME format.
--------------BA50F1628C25969E69484B81
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi,

as some Linux users might already have noticed, there's an
incompatibility issue between systemctl --user and users having their
$HOME below /afs.

Background: systemctl --user is the per-user equivalent of systemctl,
which means starting services on behalf of the current user. For this to
work, a corresponding systemd --user process is started upon the user's
first login. However, the problem here is that this process is not
started from the user's session, but from PID 1, and runs through its own
PAM stack (which is non-interactive and therefore doesn't get an AFS
token).
The result is that any systemctl --user command gets a permission
denied, for example:

% systemctl --user enable syncthing
Failed to enable unit: Access denied

because the systemd --user process is denied access to the user's $HOME.

There are discussions about this already in both the Debian and systemd
bug trackers (see links below).

The outcome of both seems to be that the problem can be solved with a
combination of two changes:

1. make sure the PAM stack for systemd --user includes pam_keyinit.so
   (suggested in the Debian bug discussion)
2. let AFS use the per-user keyring instead of the per-session one
   (suggested in the systemd bug discussion)

Does the second one sound reasonable?

Bye...

    Dirk

1. Debian bug
2. systemd bug

-- 
Dirk Heinrichs
GPG Public Key: D01B367761B0F7CE6E6D81AAD5A2E54246986015
Secure Internet communication: http://www.retroshare.org
Privacy Handbuch: https://www.privacy-handbuch.de

--------------BA50F1628C25969E69484B81
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi,
--------------BA50F1628C25969E69484B81--
--O6CNZIwhucmZMv8oLPpMRFIXMCAJPtygz--
--zPQKnHeQKMO0kUtHzdojHAm6Rjiyv0KNg--

From jaltman@auristor.com Thu Mar 8 17:54:45 2018
From: jaltman@auristor.com (Jeffrey Altman)
Date: Thu, 8 Mar 2018 12:54:45 -0500
Subject: [OpenAFS] Linux: systemctl --user vs. AFS
In-Reply-To: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
Message-ID:

This is a cryptographically signed message in MIME format.

> 2. let AFS use the per-user keyring instead of the per-session one
> (suggested in the systemd bug discussion)
>
> Does the second one sound reasonable?

Switching to the user keyring is unreasonable. The impact of such a
change is that all user sessions on a system share the same tokens and
an effective uid change permits access to those same tokens.

Process Authentication Groups (PAGs) exist explicitly to establish a
security barrier to prevent such credential leakage.

Just my two cents ...

Jeffrey Altman

From jsbillin@umich.edu Thu Mar 8 19:08:17 2018
From: jsbillin@umich.edu (Jonathan Billings)
Date: Thu, 8 Mar 2018 14:08:17 -0500
Subject: [OpenAFS] Linux: systemctl --user vs. AFS
In-Reply-To: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
Message-ID:

--001a11375848fda2b20566eb651b
Content-Type: text/plain; charset="UTF-8"

There's a Google doc in the Debian bug that I wrote
(https://docs.google.com/document/d/1P27fP1uj-C8QdxDKMKtI-Qh00c5_9zJa4YHjnpB6ODM/pub).
The approach is to create an /etc/systemd/user/aklog.service that is
automatically started as part of the login; it runs aklog so that the
processes started by systemd --user have tokens. This assumes that it's
got its own keyring.

This works, to a certain extent. I also have a startup script that I
wrote that runs dbus-monitor to watch org.gnome.ScreenSaver, and restarts
the aklog.service user service every time you unlock the screensaver, so
those tokens get renewed with the updated krb5 credentials.

It's all very hacky and is a constant source of pain for me since I use
AFS as my $HOME. I feel like a better solution would be to not start
systemd --user externally to your login session (and PAG) and instead
have it start up as part of the PAM stack, but that isn't systemd-ey
enough.

On Thu, Mar 8, 2018 at 11:39 AM, Dirk Heinrichs wrote:

> Hi,
> as some Linux users might already have noticed, there's an incompatibility
> issue between systemctl --user and users having their $HOME below /afs.
>
> Background: systemctl --user is the per-user equivalent of systemctl,
> which means starting services on behalf of the current user. For this to
> work, a corresponding systemd --user process is started upon the users
> first login. However, the problem here is that this process is not started
> from the users session, but from PID 1, and runs through its own PAM stack
> (which is non-interactive and therefor doesn't get an AFS token).
> The result is that any systemctl --user command gets a permission denied,
> for example:
>
> % systemctl --user enable syncthing
> Failed to enable unit: Access denied
>
> because the systemd --user process is denied access to the users $HOME.
>
> There are discussions about this already in both the Debian and systemd
> bug trackers (see links below).
>
> The outcome of both seems to be that the problem can be solved with a
> combination of two changes:
>
> 1. make sure the PAM stack for systemd --user includes pam_keyinit.so
> (suggested in the Debian bug discussion)
> 2. let AFS use the per-user keyring instead of the per-session one
> (suggested in the systemd bug discussion)
>
> Does the second one sound reasonable?
>
> Bye...
>
> Dirk
>
> 1. Debian bug
>
> 2. systemd bug
>
>
> --
> Dirk Heinrichs
> GPG Public Key: D01B367761B0F7CE6E6D81AAD5A2E54246986015
> Sichere Internetkommunikation: http://www.retroshare.org
> Privacy Handbuch: https://www.privacy-handbuch.de
>

-- 
Jonathan Billings
College of Engineering - CAEN - Unix and Linux Support

--001a11375848fda2b20566eb651b
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
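The aklog.service approach described in this message could look roughly like the sketch below, which prints a minimal systemd user unit from a shell function. The unit contents, the function name print_aklog_unit, and the /usr/bin/aklog path are all assumptions made for illustration; they are not taken from the Google doc referenced in the message, and the unit only helps if valid Kerberos credentials are reachable from the systemd --user process.

```shell
#!/bin/sh
# Hypothetical sketch: print a minimal systemd user unit that runs aklog
# once when the user manager starts. The unit contents are assumptions,
# not the unit from the document referenced in the thread.
print_aklog_unit() {
    cat <<'EOF'
[Unit]
Description=Obtain AFS tokens for systemd --user services

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/aklog

[Install]
WantedBy=default.target
EOF
}

print_aklog_unit
```

An administrator might redirect this output to /etc/systemd/user/aklog.service and have users restart the unit with systemctl --user restart aklog.service after their Kerberos credentials are renewed, along the lines of the screensaver hook mentioned in the message.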

--001a11375848fda2b20566eb651b-- From dirk.heinrichs@altum.de Fri Mar 9 16:06:20 2018 From: dirk.heinrichs@altum.de (Dirk Heinrichs) Date: Fri, 9 Mar 2018 17:06:20 +0100 Subject: [OpenAFS] Linux: systemctl --user vs. AFS In-Reply-To: References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de> Message-ID: This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --Ibl1Zcf44lbnVQlKE7Ruap09tN9NJJBj0 Content-Type: multipart/mixed; boundary="hYJBjJa02mwMWuY57MWcvhlFgEUhFxikt"; protected-headers="v1" From: Dirk Heinrichs To: openafs-info@openafs.org Message-ID: Subject: Re: [OpenAFS] Linux: systemctl --user vs. AFS References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de> In-Reply-To: --hYJBjJa02mwMWuY57MWcvhlFgEUhFxikt Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Content-Language: de-DE Am 08.03.2018 um 18:54 schrieb Jeffrey Altman: >> 2. let AFS use the per-user keyring instead of the per-session one >> (suggested in the systemd bug discussion) >> >> Does the second one sound reasonable? > Switching to the user keyring is unreasonable. The impact of such a > change is that all user sessions on a system share the same tokens and > an effective uid change permits access to those same tokens. > > Process Authentication Groups (PAGs) exist explicitly to establish a > security barrier to prevent such credential leakage. I understand. However, why not let the user (or better: admin) decide? I assume this is coded in the cache manager, so the module could be enhanced with a parameter that allows to choose between the two variants at module load time. The current behaviour of using the session keyring could still be the default. Adding my own two cents... Bye... 
=C2=A0=C2=A0=C2=A0 Dirk --=20 Dirk Heinrichs GPG Public Key: D01B367761B0F7CE6E6D81AAD5A2E54246986015 Sichere Internetkommunikation: http://www.retroshare.org Privacy Handbuch: https://www.privacy-handbuch.de --hYJBjJa02mwMWuY57MWcvhlFgEUhFxikt-- --Ibl1Zcf44lbnVQlKE7Ruap09tN9NJJBj0 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEEJgWJ3LIo7zNO9tmf0p7rxfc7RqsFAlqisPwACgkQ0p7rxfc7 RqtcMhAAp63Nore76QtSSirX8Shk3J50IuHcRh7XDHv+s61WwSxKfhbhCO0/9BFe c0f66hQIybDKw5+8eUa2x0oRuVimWbuxqExkYAawlzoeHm2uPxKwOPTx128k85ig YwvYrWROXF/oNDnnanHXgadTEeazSmcDBmsd5zS/F8sCQgvoRuOUZUM6S3JEKrO4 mCjnuYmu6bkhnIJxEoruoAtpBxL0+Yu80rbn7WUMjbkXAEQidhnFrzfdzFXiUWss zWoUKKE9pIruQ408E43QW+X5SYQ3jhjf+T2jZFrQ0soCXokDhKeaGMfUQayfKugR r4Ay0NrnIoFdM6K2jTGd8ED58ml5YIquzm/jrkvm70vtWVql714HeWNnkomdw58B 1+9clvPR1hmiv5zE7yhxyfUKsGMUcBInyeUSmlvTPP8uFJPkGRR0vTP9gdCDIRnJ NdAtHaTz3Fc5b+RmdSKeOkEebkaihUe1brqmTCTD3lYVWPFb6kQoaT3OHBPflRcx 3TFmFdAWdrVP5r9bteHx4VGHotcZa6XTPFsFMV7ln30s4uHkI10lP3f4QdO7EPQu lQH7tg8zOdfJ0HpXszSubdJ4tbtaJ9DgNQquy2Xc04xtKDaPS2agXfQrH3P/te2O 612G78Fw2gwQdxxV8Esp4FpQMF1hFZbwPwZ0NdAVqUGRFVO3a60= =JHTz -----END PGP SIGNATURE----- --Ibl1Zcf44lbnVQlKE7Ruap09tN9NJJBj0-- From jaltman@auristor.com Fri Mar 9 17:05:34 2018 From: jaltman@auristor.com (Jeffrey Altman) Date: Fri, 9 Mar 2018 12:05:34 -0500 Subject: [OpenAFS] Linux: systemctl --user vs. AFS In-Reply-To: References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de> Message-ID: <88e5dc94-bc67-c5ed-7f33-525cd06a9148@auristor.com> This is a cryptographically signed message in MIME format. --------------ms020900020901020707030200 Content-Type: multipart/mixed; boundary="------------6FA2C2DE99AAB986120D4C0D" Content-Language: en-US This is a multi-part message in MIME format. 
--------------6FA2C2DE99AAB986120D4C0D
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 3/9/2018 11:06 AM, Dirk Heinrichs wrote:
> On 08.03.2018 at 18:54, Jeffrey Altman wrote:
>>> 2. let AFS use the per-user keyring instead of the per-session one
>>> (suggested in the systemd bug discussion)
>>>
>>> Does the second one sound reasonable?
>> Switching to the user keyring is unreasonable. The impact of such a
>> change is that all user sessions on a system share the same tokens and
>> an effective uid change permits access to those same tokens.
>>
>> Process Authentication Groups (PAGs) exist explicitly to establish a
>> security barrier to prevent such credential leakage.
>
> I understand. However, why not let the user (or better: admin) decide? I
> assume this is coded in the cache manager, so the module could be
> enhanced with a parameter that allows to choose between the two variants
> at module load time. The current behaviour of using the session keyring
> could still be the default.

It is already up to the administrator. The choice of whether or not to use
PAGs is a decision made by the tooling that acquires tokens. If PAGs
are not used, the tokens are bound to the uid.

Making the choice to not use PAGs means that there is a serious security
vulnerability.

Jeffrey Altman

--------------6FA2C2DE99AAB986120D4C0D
Content-Type: text/x-vcard; charset=utf-8; name="jaltman.vcf"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="jaltman.vcf"

begin:vcard
fn:Jeffrey Altman
n:Altman;Jeffrey
org:AuriStor, Inc.
adr:Suite 6B;;255 West 94Th Street;New York;New York;10025-6985;United States
email;internet:jaltman@auristor.com
title:Founder and CEO
tel;work:+1-212-769-9018
note;quoted-printable:LinkedIn: https://www.linkedin.com/in/jeffreyaltman=0D=0A=
 Skype: jeffrey.e.altman=0D=0A=
url:https://www.auristor.com/
version:2.1
end:vcard

--------------6FA2C2DE99AAB986120D4C0D--

--------------ms020900020901020707030200--

From drosih@rpi.edu Fri Mar 9 19:24:46 2018
From: drosih@rpi.edu (Garance A Drosehn)
Date: Fri, 09 Mar 2018 14:24:46 -0500
Subject: [OpenAFS] Linux: systemctl --user vs. AFS
In-Reply-To:
References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
Message-ID: <95329B71-0E2D-4934-B4C8-53CC194EC5BE@rpi.edu>

On 9 Mar 2018, at 11:06, Dirk Heinrichs wrote:

> On 08.03.2018 at 18:54, Jeffrey Altman wrote:
>> Switching to the user keyring is unreasonable. The impact of such
>> a change is that all user sessions on a system share the same tokens
>> and an effective uid change permits access to those same tokens.
>>
>> Process Authentication Groups (PAGs) exist explicitly to establish a
>> security barrier to prevent such credential leakage.
>
> I understand. However, why not let the user (or better: admin) decide?
> I assume this is coded in the cache manager, so the module could be
> enhanced with a parameter that allows to choose between the two variants
> at module load time.

Chances are very good that most administrators won't really understand
the security issues. Or maybe THEY will understand, but their users
will not. And then the users will get into weird problems with no
understanding of what is causing the problem.

So let's say someone has multiple services running under their userid's
keyring. One of them does a klog to a different AFS identity (because
it needs to), or it does an unlog. My guess is that that change will
immediately happen to all the other services, and that the other services
might fail in very weird ways.
Or say one of those services has to do a 'sudo' (such as to copy files
from AFS space into local filesystems), and immediately loses access to
the very file(s) that it needs to copy.

Note: when I was first trying to figure out PAM on Linux I did not have
things set up right, and therefore I would get uid-based auth instead of
PAG-based auth. It "mostly works okay", but I kept hitting irritating
edge cases where things go wrong and it takes a few minutes to realize
why. And usually this happens in the middle of trying to fix something
*else* which has gone wrong, and you *really* don't want any additional
headaches!

-- 
Garance Alistair Drosehn = drosih@rpi.edu
Senior Systems Programmer or gad@FreeBSD.org
Rensselaer Polytechnic Institute; Troy, NY; USA

From jsbillin@umich.edu Fri Mar 9 20:00:40 2018
From: jsbillin@umich.edu (Jonathan Billings)
Date: Fri, 9 Mar 2018 15:00:40 -0500
Subject: [OpenAFS] Linux: systemctl --user vs. AFS
In-Reply-To: <95329B71-0E2D-4934-B4C8-53CC194EC5BE@rpi.edu>
References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de> <95329B71-0E2D-4934-B4C8-53CC194EC5BE@rpi.edu>
Message-ID:

--94eb2c0ca7c42bd0960567003fd1
Content-Type: text/plain; charset="UTF-8"

On Fri, Mar 9, 2018 at 2:24 PM, Garance A Drosehn wrote:

> Chances are very good that most administrators won't really understand
> the security issues. Or maybe THEY will understand, but their users
> will not. And then the users will get into weird problems with no
> understanding of what is causing the problem.
>

Heck, the systemd maintainers don't understand the security issues. The
"how does crond work then" question is common. Sad thing is, using NFSv4
with krb5 security suffers from the same problem, is in the Linux kernel
and supported by most distros, and yet breaks in mostly the same way.

-- 
Jonathan Billings
College of Engineering - CAEN - Unix and Linux Support

--94eb2c0ca7c42bd0960567003fd1
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--94eb2c0ca7c42bd0960567003fd1--

From gsgatlin@ncsu.edu Mon Mar 12 01:35:48 2018
From: gsgatlin@ncsu.edu (Gary Gatling)
Date: Sun, 11 Mar 2018 20:35:48 -0400
Subject: [OpenAFS] building openafs on ppc64le architecture on Linux
In-Reply-To: <20180305010539.GK50954@kduck.kaduk.org>
References: <20180305010539.GK50954@kduck.kaduk.org>
Message-ID:

--000000000000cbd07705672c52fb
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

>
>
> Hmm, it is as if CV_TIMEDWAIT() somehow got #defined away.
>
> I see from the pastebin that you are basing your work off 1.6.22; I
> would recommend starting again from master (or 1.8.0pre5 which is
> pretty similar), since (1) new code would have to go through master
> anyway, and (2) master has some changes in this area, using the
> OpenAFS Portable Runtime (opr) library instead of directly using
> pthread calls, which may or may not be relevant.
>
> -Ben
>

Sorry this email is kind of long.

I updated some rpms based on openafs-1.8.0pre5. The rpms I am using were
worked on by Jack Neely and Ken Dreyer before they both stopped working on
openafs when they got new jobs a few years back. These rpms used to live in
the popular Fedora third-party yum repository "rpmfusion", since for legal
reasons they could not be included in Fedora proper. But since they have
been abandoned by both authors and were dropped from rpmfusion, I have been
updating them locally for our users in the College of Engineering at NCSU
for a few years now. We have about 400 - 800 Linux machines in the college
that depend on AFS filesystems of some sort. Maybe 200 of those use Ubuntu
16.04, so we can use the openafs Debian packages (for now) on the Ubuntu
machines (since Debian does not have the same legal restrictions that Red
Hat has).

Rpms using 1.8.0pre5 worked in a Fedora 27 x86_64 virtual machine earlier
today, but it seemed kind of slow compared to 1.6.22.2. I know "seemed kind
of slow" is subjective.
I was able to edit files in my $HOME directory, so it was working.

1.8.0pre5 compiled and installed on a ppc64 emulator (qemu/virt-manager
running on an x86_64 laptop), but it won't start. The error I get is:

[root@localhost ~]# systemctl start openafs-client
Job for openafs-client.service failed because a timeout was exceeded.
See "systemctl status openafs-client.service" and "journalctl -xe" for details

There is some kind of error on the console. Here is what it looks like:

https://i.imgur.com/KknZ60O.png

The error message on the console pops up as soon as you start the service.

[root@localhost ~]# systemctl status openafs-client.service
● openafs-client.service - OpenAFS Client Service
   Loaded: loaded (/usr/lib/systemd/system/openafs-client.service; disabled; vendor preset: disabled)
   Active: failed (Result: timeout) since Sun 2018-03-11 19:32:42 EDT; 16s ago
  Process: 1279 ExecStartPre=/sbin/modprobe openafs (code=exited, status=0/SUCCESS)
  Process: 1277 ExecStartPre=/bin/bash -c fs sysname > /dev/null 2>/dev/null; test $? -ne 0 || (echo AFS client appears to be run
    Tasks: 6 (limit: 4915)
   CGroup: /system.slice/openafs-client.service
           ├─1281 /usr/sbin/afsd -afsdb -dynroot -fakestat -memcache -blocks 102400 -daemons 3 -confdir /etc/openafs
           ├─1285 /usr/sbin/afsd -afsdb -dynroot -fakestat -memcache -blocks 102400 -daemons 3 -confdir /etc/openafs
           ├─1286 /usr/sbin/afsd -afsdb -dynroot -fakestat -memcache -blocks 102400 -daemons 3 -confdir /etc/openafs
           └─1288 /usr/sbin/afsd -afsdb -dynroot -fakestat -memcache -blocks 102400 -daemons 3 -confdir /etc/openafs

Mar 11 19:31:11 localhost.localdomain systemd[1]: Starting OpenAFS Client Service...
Mar 11 19:32:42 localhost.localdomain systemd[1]: openafs-client.service: Start operation timed out. Terminating.
Mar 11 19:32:42 localhost.localdomain systemd[1]: Failed to start OpenAFS Client Service.
Mar 11 19:32:42 localhost.localdomain systemd[1]: openafs-client.service: Unit entered failed state.
Mar 11 19:32:42 localhost.localdomain systemd[1]: openafs-client.service: Failed with result 'timeout'.

[root@localhost ~]# journalctl -xe
Mar 11 19:31:20 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=polkit comm="systemd
Mar 11 19:31:20 localhost.localdomain dbus-daemon[681]: [system] Successfully activated service 'org.freedesktop.problems'
Mar 11 19:31:21 localhost.localdomain abrt-notification[1311]: System encountered a non-fatal error in rxi_ReapConnections()
-- Subject: ABRT has detected a non-fatal system error
-- Defined-By: ABRT
-- Support: https://bugzilla.redhat.com/
-- Documentation: man:abrt(1)
-- 
-- Unable to handle kernel paging request for data at address 0x00000000 [libafs]
-- 
-- Use the abrt command-line tool for further analysis or to report
-- the problem to the appropriate support site.
Mar 11 19:32:42 localhost.localdomain systemd[1]: openafs-client.service: Start operation timed out. Terminating.
Mar 11 19:32:42 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=openafs-client comm=
Mar 11 19:32:42 localhost.localdomain systemd[1]: Failed to start OpenAFS Client Service.
-- Subject: Unit openafs-client.service has failed
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit openafs-client.service has failed.
-- 
-- The result is failed.
Mar 11 19:32:42 localhost.localdomain systemd[1]: openafs-client.service: Unit entered failed state.
Mar 11 19:32:42 localhost.localdomain systemd[1]: openafs-client.service: Failed with result 'timeout'.

I am using dkms to handle compiling the kernel module part of openafs.
It worked with 1.6.22.2 on the ppc64 arch, so I'm not sure what could have
changed in 1.8.0pre5 with respect to the ppc64 architecture.

Maybe it needs different config options for ppc64? Or maybe I need to try
it on a faster PC, since I am running an emulator. I am unsure. Maybe it's
just too hard of a problem to solve.

Here are the configure flags I am using in the openafs spec file:

# build the user-space bits for base architectures
    ./configure \
        --prefix=%{_prefix} \
        --libdir=%{_libdir} \
        --bindir=%{_bindir} \
        --sbindir=%{_sbindir} \
        --sysconfdir=%{_sysconfdir} \
        --localstatedir=%{_var} \
        --with-afs-sysname=%{sysname} \
        --with-linux-kernel-headers=%{ksource_dir} \
        --disable-kernel-module \
        --disable-strip-binaries \
        --enable-supergroupsb \
%if %{enable_kauth}
        --enable-kauth \
%endif
        --enable-debug

I had to add "--enable-debug" to make the debug packages build in Fedora
27. The files in the regular packages (non-debug ones) get stripped.

In all compiles "enable_kauth" is 0, so the flag "--enable-kauth" is never
used. It was just easier to add a "%define enable_kauth 0" macro than to
erase all the kauth bits, like stuff in the %files section. I could add
other configure options if anyone thinks it might help.

In the dkms package the configure options are

%configure --with-afs-sysname=%{sysname} --disable-kernel-module

make libafs_tree

and then the tree is copied into dkms.

cp -a libafs_tree %{buildroot}%{_prefix}/src/%{module}-%{version}

and then dkms does

MAKE[0]="( ./configure --with-linux-kernel-headers=\${kernel_source_dir}; make; mv src/libafs/MODLOAD-*/libafs.ko ./\$PACKAGE_NAME.ko )"

where $PACKAGE_NAME.ko is openafs.ko.

Then it does in %post:

%post
dkms add -m %{module} -v %{version} --rpm_safe_upgrade &>/dev/null
dkms build -m %{module} -v %{version} --rpm_safe_upgrade &>/dev/null
dkms install -m %{module} -v %{version} --rpm_safe_upgrade &>/dev/null

Maybe dkms needs additional configuration options for the kernel module
part for ppc64?
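
The three %post steps quoted above send all of their output to /dev/null, so
whatever dkms prints during the build is lost. A minimal sketch, assuming
nothing beyond the dkms add/build/install subcommands and the -m/-v options
already shown, of running the same sequence by hand while keeping the output.
The function name dkms_steps, the DKMS override variable and the log path are
hypothetical names chosen for this illustration; they are not part of the
spec file.

```shell
#!/bin/sh
# Sketch only: run the dkms steps from %post one at a time, keep their
# output in a log file, and stop at the first step that fails.
# dkms_steps and the DKMS variable are hypothetical names for this example.

dkms_steps() {
    module=$1    # e.g. openafs
    version=$2   # e.g. 1.8.0pre5
    log=$3       # file that collects the output of every step

    for step in add build install; do
        echo "== dkms $step -m $module -v $version" >>"$log"
        # DKMS defaults to the real dkms binary; it can be pointed at a
        # stub command for a dry run.
        if ! "${DKMS:-dkms}" "$step" -m "$module" -v "$version" >>"$log" 2>&1; then
            echo "dkms $step failed, see $log" >&2
            return 1
        fi
    done
}

# Example call (commented out; it needs root and the module source tree):
# dkms_steps openafs 1.8.0pre5 /tmp/openafs-dkms.log
```

If the module and version are already registered, the add step will probably
report that and the function stops there; "dkms status" shows what is
currently registered. This only makes build-time output visible. The failure
described in this message happens when the module runs, so the log may well
show nothing unusual.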
Thanks for any ideas anyone has about these errors on ppc64. I will try
ppc64le again if I am able to solve the ppc64 errors somehow. :)

The weird thing is that 1.6.22.2 worked on ppc64 whereas 1.8.0pre5 does not.

My ultimate goal was to try to get openafs working on other arches besides
x86_64 in Linux: systems like IBM Power7 / Power8 / aarch64 / armv7hl,
hopefully in Fedora and EPEL (RHEL and CentOS). But maybe it's too hard of a
problem and I will just need to stick with the x86_64 arch. :)

Thanks,

--000000000000cbd07705672c52fb
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--000000000000cbd07705672c52fb--

From gaja.peters@math.uni-hamburg.de Sat Mar 17 16:09:25 2018
From: gaja.peters@math.uni-hamburg.de (Gaja Sophie Peters)
Date: Sat, 17 Mar 2018 16:09:25 +0100
Subject: [OpenAFS] Linux: systemctl --user vs. AFS
In-Reply-To:
References: <7f6d69d7-859d-722b-74a3-73e23621bca5@altum.de>
Message-ID:

On 08.03.2018 at 20:08, Jonathan Billings wrote:
> There's a google doc in the Debian bug that I wrote
> (https://docs.google.com/document/d/1P27fP1uj-C8QdxDKMKtI-Qh00c5_9zJa4YHjnpB6ODM/pub),
> which was to create an /etc/systemd/user/aklog.service that is
> automatically started as part of the login,

I did some testing on Ubuntu 18.04 alpha (or beta?), and ran into the same
problem, which I solved with a variant of the above, which seems to work
for the time being.

The systemd file itself goes to
  /etc/systemd/user/aklog.service
The link to start it goes to
  /etc/systemd/user/default.target.wants

The main advantage, of course, is that you don't have to make your AFS home
directory world-readable...

> what it does is runs an
> aklog so that the processes started by systemd --user have tokens. This
> assumes that it's got its own keyring.

This seems to work. "xterm" and "gnome-terminal" are still in separate
PAGs, but since both can read the Kerberos ticket, both can get the AFS
token.
I added an "unlog" to ExecStop, so that the token will be destroyed on
logout. Without that, the once-obtained token will remain, even after
logout and immediate re-login. (Tested with manual, non-scripted aklog...)

> This works, to a certain extent. I also have a startup script that I
> wrote that runs dbus-monitor to watch org.gnome.ScreenSaver, and restart
> the aklog.service user service every time you unlock the screensaver, so
> those tokens get renewed with the updated krb5 credentials.

I tried to combine both parts into a single "aklog.service" file (see
below).
I don't know much about systemd and even less about dbus, so there might be
things that are backwards...
An added complication for me was that at the point where I wanted the
aklog.service to be executed, the environment variable KRB5CCNAME wasn't
yet set, so I used a somewhat hackish fragment to construct the variable
from the file that already existed in /tmp.

File /etc/systemd/user/aklog.service
>>>>>>>>>>>>>>>>>>>>>
[Unit]
Description=aklog for session --user
Before=gnome-keyring-ssh.service

[Service]
Type=simple
ExecStartPre=/bin/sh -c ' \
  KRB5CCNAME=FILE:$(ls -t /tmp/krb5cc_${XDG_RUNTIME_DIR#/run/user/}*|head -1) aklog -d'
ExecStart=/bin/sh -c ' \
  dbus-monitor --profile path=/org/freedesktop/secrets/collection/login | \
  while read TYPE LINE; \
  do \
    [ "$TYPE" = "mc" ] && systemctl --user reload aklog; \
  done'
ExecReload=/usr/bin/aklog
ExecStop=/usr/bin/unlog

[Install]
WantedBy=default.target
>>>>>>>>>>>>>>>>>>>>>

Explanation: the dbus-monitor needs to run all the time for new logins, so
I made it the main process of the service. Before that, aklog needs to be
started with a (re-)constructed KRB5CCNAME, which is missing from the
environment at that time, so I look for the newest krb5cc file with the
current user ID in /tmp. The user ID itself doesn't exist in the
environment at that point, only the username (in $LOGNAME and $USER);
however, the user ID is found as part of $XDG_RUNTIME_DIR, so I used that.
The dbus-monitor watches "something" that seems to be called exactly once
on each login - no idea if there are better things to watch for (a
disadvantage of the screensaver seemed to be that there are two lines, one
for locking, one for unlocking). The first few returned lines start with
"sig" or "#" and aren't interesting; the interesting lines have "mc" as
their first word and will reload the aklog.service (KRB5CCNAME doesn't need
to be set again).
When the service ends, unlog will destroy the AFS token.
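
The two shell idioms packed into the unit file above can be tried outside
systemd. What follows is a minimal sketch under stated assumptions, not the
unit's actual code: the function names uid_from_runtime_dir, newest_ccache
and on_method_call are hypothetical, while the parameter expansion, the
"ls -t ... | head" lookup and the test for a first word of "mc" come from
the unit file.

```shell
#!/bin/sh
# Sketch only; all three function names are hypothetical.

# Strip the "/run/user/" prefix from a runtime directory path, which
# leaves the numeric user ID (same expansion as in ExecStartPre).
uid_from_runtime_dir() {
    printf '%s\n' "${1#/run/user/}"
}

# Print the most recently modified krb5cc_<uid>* file in a directory
# (default /tmp). Prints nothing if no such file exists.
newest_ccache() {
    uid=$1
    dir=${2:-/tmp}
    ls -t "$dir"/krb5cc_"$uid"* 2>/dev/null | head -n 1
}

# Read lines on stdin and run the given command once for every line
# whose first word is "mc"; every other line is ignored.
on_method_call() {
    while read -r type _rest; do
        if [ "$type" = "mc" ]; then
            "$@"
        fi
    done
}

# Example use (commented out; it needs a Kerberos ticket and aklog):
# uid=$(uid_from_runtime_dir "$XDG_RUNTIME_DIR")
# KRB5CCNAME=FILE:$(newest_ccache "$uid") aklog -d
```

One thing the sketch makes visible: the pattern krb5cc_<uid>* also matches
the caches of any user whose ID merely starts with the same digits (1000
and 10001, for example), so on a multi-user machine the newest match may
belong to someone else.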
For the "Before=" line, I simply looked for something that was run fairly early in the boot-process and told systemd, I want to run even before that. Greetings, Gaja Peters P.S. I have tested this ONLY in Ubuntu 18.04, it might be completely different in another system! D-Bus might have to be monitored for something else, and the variable XDG_RUNTIME_DIR might point to something different. From kfiresmith@gmail.com Fri Mar 23 12:27:15 2018 From: kfiresmith@gmail.com (Kodiak Firesmith) Date: Fri, 23 Mar 2018 07:27:15 -0400 Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up In-Reply-To: References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu> Message-ID: --089e082243ccd41082056812b4a1 Content-Type: text/plain; charset="UTF-8" I've also tested gsgatlin's 7.5beta RPMs and they work great. Any chance we'll see the rh75enotdir patch integrated into a release of 1.6.22.3 soon? I'm wondering if it'll be worth it to manually apply that patch to a rebuild of the official OpenAFS RPMs if this isn't on the block for being merged and released soon - but I don't want to blow the time applying that patch to a re-roll if a fixed official release is forthcoming. Thanks! - Kodiak On Fri, Mar 2, 2018 at 3:47 AM, Anders Nordin wrote: > Hello, > > Is there any progress on this issue? Can we expect a stable release for > RHEL 7.5? > > MVH > Anders > > -----Original Message----- > From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@ope > nafs.org] On Behalf Of Benjamin Kaduk > Sent: den 9 februari 2018 01:02 > To: Kodiak Firesmith > Cc: openafs-info > Subject: Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up > > On Wed, Feb 07, 2018 at 11:46:28AM -0500, Kodiak Firesmith wrote: > > Hello again All, > > > > As part of continued testing, I've been able to confirm that the > > SystemD double-service startup thing only happens to my hosts when > > going from RHEL > > 7.4 to RHEL 7.5beta. 
On a test host installed directly as RHEL > > 7.5beta, I get a bit farther with 1.6.18.22, in that I get to the > > point where OpenAFS "kind of" works. > > Thanks for tracking this down. The rpm packaging maintainers may want to > try to track down why the double-start happens in the upgrade scenario, as > that's pretty nasty behavior. > > > What I'm observing is that the openafs client Kernel module (built by > > DKMS) loads fine, and just so long as you know where you need to go in > > /afs, you can get there, and you can read and write files and the > OpenAFS 'fs' > > command works. But doing an 'ls' of /afs or any path underneath > > results in > > "ls: reading directory /afs/: Not a directory". > > > > I ran an strace of a good RHEL 7.4 host running ls on /afs, and a RHEL > > 7.5beta host running ls on /afs and have created pastebins of both, as > > well as an inline diff. > > > > All can be seen at the following locations: > > > > works > > https://paste.fedoraproject.org/paste/Hiojt2~Be3wgez47bKNucQ > > > > fails > > https://paste.fedoraproject.org/paste/13ZXBfJIOMsuEJFwFShBfg > > > > > > diff > > https://paste.fedoraproject.org/paste/FJKRwep1fWJogIDbLnkn8A > > > > Hopefully this might help the OpenAFS devs, or someone might know what > > might be borking on every RHEL 7.5 beta host. It does fit with what > > other > > 7.5 beta users have observed OpenAFS doing. > > Yes, now it seems like all our reports are consistent, and we just have to > wait for a developer to get a better look at what Red Hat changed in the > kernel that we need to adapt to. > > -Ben > > > Thanks! > > - Kodiak > > > > On Mon, Feb 5, 2018 at 12:31 PM, Stephan Wiesand > > > > wrote: > > > > > > > > > On 04.Feb 2018, at 02:11, Jeffrey Altman > wrote: > > > > > > > > On 2/2/2018 6:04 PM, Kodiak Firesmith wrote: > > > >> I'm relatively new to handling OpenAFS. 
Are these problems part > > > >> of a normal "kernel release; openafs update" cycle and perhaps > > > >> I'm getting snagged just by being too early of an adopter? I > > > >> wanted to raise the alarm on this and see if anything else was > > > >> needed from me as the reporter of the issue, but perhaps that's > > > >> an overreaction to what is just part of a normal process I just > > > >> haven't been tuned into in prior RHEL release cycles? > > > > > > > > > > > > Kodiak, > > > > > > > > On RHEL, DKMS is safe to use for kernel modules that restrict > > > > themselves to using the restricted set of kernel interfaces (the > > > > RHEL KABI) that Red Hat has designated will be supported across > > > > the lifespan of the RHEL major version number. OpenAFS is not > > > > such a kernel module. As a result it is vulnerable to breakage each > and every time a new kernel is shipped. > > > > > > Jeffrey, > > > > > > the usual way to use DKMS is to either have it build a module for a > > > newly installed kernel or install a prebuilt module for that kernel. > > > It may be possible to abuse it for providing a module built for > > > another kernel, but I think that won't happen accidentally. > > > > > > You may be confusing DKMS with RHEL's "KABI tracking kmods". Those > > > should be safe to use within a RHEL minor release (and the SL > > > packaging has been using them like this since EL6.4), but aren't > > > across minor releases (and that's why the SL packaging modifies the > > > kmod handling to require a build for the minor release in question. > > > > > > > There are two types of failures that can occur: > > > > > > > > 1. a change results in failure to build the OpenAFS kernel module > > > > for the new kernel > > > > > > > > 2. a change results in the OpenAFS kernel module building and > > > > successfully loading but failing to operate correctly > > > > > > The latter shouldn't happen within a minor release, but can across > > > minor releases. 
> > > > > > > It is the second of these possibilities that has taken place with > > > > the release of the 3.10.0-830.el7 kernel shipped as part of the > > > > RHEL 7.5 > > > beta. > > > > > > > > Are you an early adopter of RHEL 7.5 beta? Absolutely, its a beta > > > > release and as such you should expect that there will be bugs and > > > > that third party kernel modules that do not adhere to the KABI > > > > functionality might have compatibility issues. > > > > > > The -830 kernel can break 3rd-party modules using non-whitelisted > > > ABIs, whether or not they adhere to the "KABI functionality". > > > > > > > There was a compatibility issue with RHEL 7.4 kernel > > > > (3.10.0_693.1.1.el7) as well that was only fixed in the OpenAFS > > > > 1.6 release series this past week as part of 1.6.22.2: > > > > > > > > http://www.openafs.org/dl/openafs/1.6.22.2/RELNOTES-1.6.22.2 > > > > > > Yes, and this one was hard to fix. Thanks are due to Mark Vitale for > > > developing the fix and all those who reviewed and tested it. > > > > > > > Jeffrey Altman > > > > AuriStor, Inc. > > > > > > > > P.S. - Welcome to the community. > > > > > > Seconded. In particular, the problem report regarding the EL7.5beta > > > kernel was absolutely appropriate. > > > > > > -- > > > Stephan Wiesand > > > DESY - DV - > > > Platanenallee 6 > > > 15738 Zeuthen, Germany > > > > > > > > > > _______________________________________________ > OpenAFS-info mailing list > OpenAFS-info@openafs.org > https://lists.openafs.org/mailman/listinfo/openafs-info > --089e082243ccd41082056812b4a1 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
--089e082243ccd41082056812b4a1-- From stephan.wiesand@desy.de Fri Mar 23 14:50:05 2018 From: stephan.wiesand@desy.de (Stephan Wiesand) Date: Fri, 23 Mar 2018 14:50:05 +0100 Subject: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up In-Reply-To: References: <54924524-154a-bee0-1719-77f8af636f63@auristor.com> <20180209000156.GM12363@mit.edu> Message-ID: > On 23. Mar 2018, at 12:27, Kodiak Firesmith wrote: > > I've also tested gsgatlin's 7.5beta RPMs and they work great. Any chance we'll see the rh75enotdir patch integrated into a release of 1.6.22.3 soon? I'm wondering if it'll be worth it to manually apply that patch to a rebuild of the official OpenAFS RPMs if this isn't on the block for being merged and released soon - but I don't want to blow the time applying that patch to a re-roll if a fixed official release is forthcoming. We are planning to release a 1.6.22.3 addressing the ENOTDIR issue with the EL7.5 kernel soon after the EL7.5 GA release. - Stephan > Thanks! > - Kodiak > > > On Fri, Mar 2, 2018 at 3:47 AM, Anders Nordin wrote: > Hello, > > Is there any progress on this issue? Can we expect a stable release for RHEL 7.5? > > MVH > Anders > > -----Original Message----- > From: openafs-info-admin@openafs.org [mailto:openafs-info-admin@openafs.org] On Behalf Of Benjamin Kaduk > Sent: den 9 februari 2018 01:02 > To: Kodiak Firesmith > Cc: openafs-info > Subject: Re: [OpenAFS] RHEL 7.5 beta / 3.10.0-830.el7.x86_66 kernel lock up > > On Wed, Feb 07, 2018 at 11:46:28AM -0500, Kodiak Firesmith wrote: > > Hello again All, > > > > As part of continued testing, I've been able to confirm that the > > SystemD double-service startup thing only happens to my hosts when > > going from RHEL > > 7.4 to RHEL 7.5beta. On a test host installed directly as RHEL > > 7.5beta, I get a bit farther with 1.6.18.22, in that I get to the > > point where OpenAFS "kind of" works. > > Thanks for tracking this down. 
The rpm packaging maintainers may want to try to track down why the double-start happens in the upgrade scenario, as that's pretty nasty behavior. > > > What I'm observing is that the openafs client Kernel module (built by > > DKMS) loads fine, and just so long as you know where you need to go in > > /afs, you can get there, and you can read and write files and the OpenAFS 'fs' > > command works. But doing an 'ls' of /afs or any path underneath > > results in > > "ls: reading directory /afs/: Not a directory". > > > > I ran an strace of a good RHEL 7.4 host running ls on /afs, and a RHEL > > 7.5beta host running ls on /afs and have created pastebins of both, as > > well as an inline diff. > > > > All can be seen at the following locations: > > > > works > > https://paste.fedoraproject.org/paste/Hiojt2~Be3wgez47bKNucQ > > > > fails > > https://paste.fedoraproject.org/paste/13ZXBfJIOMsuEJFwFShBfg > > > > > > diff > > https://paste.fedoraproject.org/paste/FJKRwep1fWJogIDbLnkn8A > > > > Hopefully this might help the OpenAFS devs, or someone might know what > > might be borking on every RHEL 7.5 beta host. It does fit with what > > other > > 7.5 beta users have observed OpenAFS doing. > > Yes, now it seems like all our reports are consistent, and we just have to wait for a developer to get a better look at what Red Hat changed in the kernel that we need to adapt to. > > -Ben > > > Thanks! > > - Kodiak > > > > On Mon, Feb 5, 2018 at 12:31 PM, Stephan Wiesand > > > > wrote: > > > > > > > > > On 04.Feb 2018, at 02:11, Jeffrey Altman wrote: > > > > > > > > On 2/2/2018 6:04 PM, Kodiak Firesmith wrote: > > > >> I'm relatively new to handling OpenAFS. Are these problems part > > > >> of a normal "kernel release; openafs update" cycle and perhaps > > > >> I'm getting snagged just by being too early of an adopter? 
I > > > >> wanted to raise the alarm on this and see if anything else was > > > >> needed from me as the reporter of the issue, but perhaps that's > > > >> an overreaction to what is just part of a normal process I just > > > >> haven't been tuned into in prior RHEL release cycles? > > > > > > > > > > > > Kodiak, > > > > > > > > On RHEL, DKMS is safe to use for kernel modules that restrict > > > > themselves to using the restricted set of kernel interfaces (the > > > > RHEL KABI) that Red Hat has designated will be supported across > > > > the lifespan of the RHEL major version number. OpenAFS is not > > > > such a kernel module. As a result it is vulnerable to breakage each and every time a new kernel is shipped. > > > > > > Jeffrey, > > > > > > the usual way to use DKMS is to either have it build a module for a > > > newly installed kernel or install a prebuilt module for that kernel. > > > It may be possible to abuse it for providing a module built for > > > another kernel, but I think that won't happen accidentally. > > > > > > You may be confusing DKMS with RHEL's "KABI tracking kmods". Those > > > should be safe to use within a RHEL minor release (and the SL > > > packaging has been using them like this since EL6.4), but aren't > > > across minor releases (and that's why the SL packaging modifies the > > > kmod handling to require a build for the minor release in question. > > > > > > > There are two types of failures that can occur: > > > > > > > > 1. a change results in failure to build the OpenAFS kernel module > > > > for the new kernel > > > > > > > > 2. a change results in the OpenAFS kernel module building and > > > > successfully loading but failing to operate correctly > > > > > > The latter shouldn't happen within a minor release, but can across > > > minor releases. 
> > > > > > > It is the second of these possibilities that has taken place with > > > > the release of the 3.10.0-830.el7 kernel shipped as part of the > > > > RHEL 7.5 > > > beta. > > > > > > > > Are you an early adopter of RHEL 7.5 beta? Absolutely, its a beta > > > > release and as such you should expect that there will be bugs and > > > > that third party kernel modules that do not adhere to the KABI > > > > functionality might have compatibility issues. > > > > > > The -830 kernel can break 3rd-party modules using non-whitelisted > > > ABIs, whether or not they adhere to the "KABI functionality". > > > > > > > There was a compatibility issue with RHEL 7.4 kernel > > > > (3.10.0_693.1.1.el7) as well that was only fixed in the OpenAFS > > > > 1.6 release series this past week as part of 1.6.22.2: > > > > > > > > http://www.openafs.org/dl/openafs/1.6.22.2/RELNOTES-1.6.22.2 > > > > > > Yes, and this one was hard to fix. Thanks are due to Mark Vitale for > > > developing the fix and all those who reviewed and tested it. > > > > > > > Jeffrey Altman > > > > AuriStor, Inc. > > > > > > > > P.S. - Welcome to the community. > > > > > > Seconded. In particular, the problem report regarding the EL7.5beta > > > kernel was absolutely appropriate. -- Stephan Wiesand DESY -DV- Platanenallee 6 15738 Zeuthen, Germany