[OpenAFS-devel] 1.8.x AIX support

Ben Huntsman ben@huntsmans.net
Tue, 16 May 2023 20:28:01 +0000



Hi there!
   Yes, that was it!  I pulled that in and it solved the issue.  With 15106
and 14705 applied, plus the other three you already have proposed in gerrit,
the 1.8.x branch works on AIX.

Thank you!

-Ben
________________________________
From: Cheyenne Wills <cwills@sinenomine.net>
Sent: Tuesday, May 16, 2023 11:10 AM
To: Ben Huntsman <ben@huntsmans.net>
Cc: openafs-devel@openafs.org <openafs-devel@openafs.org>
Subject: Re: [OpenAFS-devel] 1.8.x AIX support

On Tue, 16 May 2023 17:21:58 +0000
Ben Huntsman <ben@huntsmans.net> wrote:

> Hi there!
>    Here is the backtrace with a debug build:
>
> bash-4.2# dbx /opt/openafs/libexec/openafs/vlserver core
> Type 'help' for help.
> [using memory image in core]
> reading symbolic information ...
>
> IOT/Abort trap in pthread_kill at 0xd054cb34 ($t2)
> 0xd054cb34 (pthread_kill+0xb4) 80410014            lwz   r2,0x14(r1)
> (dbx) where all
> Thread $t1
> _sigsetmask(??, ??, ??) at 0xd054b288
> _p_sigaction(??, ??, ??) at 0xd054be68
> raise.sigaction(??, ??, ??) at 0xd0120d50
> signal(??, ??) at 0xd021f6f4
> SetupLogSoftSignals(), line 469 in "serverLog.c"
> main(argc = 0, argv = (nil)), line 399 in "vlserver.c"
> Thread $t2
> pthread_kill(??, ??) at 0xd054cb34
> _p_raise(??) at 0xd054bf84
> raise.raise(??) at 0xd0121020
> abort() at 0xd017ca64
> opr_AssertionFailed(file = (nil), line = 0), line 29 in "assert.c"
> signalHandler(arg = (nil)), line 73 in "softsig.c"
>
>
> Hopefully that sheds more light on the situation?
>
> Thank you!
>
> -Ben
>
> ________________________________
> From: Cheyenne Wills <cwills@sinenomine.net>
> Sent: Tuesday, May 16, 2023 6:04 AM
> To: Ben Huntsman <ben@huntsmans.net>
> Cc: openafs-devel@openafs.org <openafs-devel@openafs.org>
> Subject: Re: [OpenAFS-devel] 1.8.x AIX support
>
> On Tue, 16 May 2023 04:46:40 +0000
> Ben Huntsman <ben@huntsmans.net> wrote:
>
> > Hi there-
> >    I see that the three AIX issues are being pulled up to 1.8.x.  I
> > just tried them out and found that we also need this one:
> >
> > 15106
> >
> >    We get a kernel panic on AIX as soon as afsd is started without
> > 15106 applied.
> >
> >    I pulled in 15106 and it compiles, but then I have another
> > problem that is that many of the servers coredump immediately:
> >
> > # /opt/openafs/libexec/openafs/vlserver
> > IOT/Abort trap(coredump)
> > # dbx /opt/openafs/libexec/openafs/vlserver core
> > Type 'help' for help.
> > [using memory image in core]
> > reading symbolic information ...warning: no source compiled with -g
> >
> >
> > IOT/Abort trap in pthread_kill at 0xd054cb34 ($t2)
> > 0xd054cb34 (pthread_kill+0xb4) 80410014            lwz   r2,0x14(r1)
> > (dbx) where all
> > Thread $t1
> > _sigsetmask(??, ??, ??) at 0xd054b288
> > _p_sigaction(??, ??, ??) at 0xd054be68
> > raise.sigaction(??, ??, ??) at 0xd0120d50
> > signal(??, ??) at 0xd021f6f4
> > .() at 0x100708e4
> > .() at 0x100019fc
> > Thread $t2
> > pthread_kill(??, ??) at 0xd054cb34
> > _p_raise(??) at 0xd054bf84
> > raise.raise(??) at 0xd0121020
> > abort() at 0xd017ca64
> > .() at 0x10005f8c
> > .() at 0x10071670
> >
> >
> > Are we missing another patch as well?  Anyone have an idea what it
> > might be?
> >
> > Thanks much!
> >
> > -Ben
> >
> >
>
> Can you try doing a build with --enable-debug as a configure option?
> That should provide information for the backtrace.
>
> --
> Cheyenne Wills
> cwills@sinenomine.net


Looks like gerrit 14705 (from master) is needed as well
--
Cheyenne Wills
cwills@sinenomine.net
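
For anyone wanting to reproduce the combination reported as working at the top of this thread (15106 and 14705 on top of the 1.8.x branch), the sketch below shows one way to work out which Gerrit ref to fetch for a given change. The helper name gerrit_ref is hypothetical and not part of OpenAFS. The patchset numbers are not given in this thread, so they have to be looked up in Gerrit; the ref layout used here is the standard Gerrit convention.

```shell
# Hypothetical helper (not part of OpenAFS): print the Gerrit ref for a
# change number and patchset. Gerrit publishes each patchset under
# refs/changes/<last two digits of change>/<change>/<patchset>.
gerrit_ref() {
    change=$1
    patchset=$2
    # Last two digits of the change number, zero-padded (15106 -> 06).
    suffix=$(printf '%02d' $((change % 100)))
    printf 'refs/changes/%s/%s/%s\n' "$suffix" "$change" "$patchset"
}
```

Assuming a git remote that points at the OpenAFS Gerrit repository, the result could be used as `git fetch <remote> "$(gerrit_ref 15106 <patchset>)"` followed by `git cherry-pick FETCH_HEAD`, and the same again for 14705. Rebuilding after `./configure --enable-debug`, as suggested earlier in the thread, keeps the symbol information that made the second backtrace readable.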
