[OpenAFS] 1.4.0 on AIX 5.3 system dumps

scajoan scajoan@spray.se
Thu, 30 Mar 2006 12:14:33 +0000 (GMT)




<html><head><style type="text/css">body{font:12px Arial;margin:3px;overflow-y:auto;overflow-x:auto}p{margin:0px;}blockquote, ol, ul{margin-top:0px;margin-bottom:0px;}</style></head>

<body><div style="DISPLAY: block; FONT-SIZE: 12px; FONT-FAMILY: Arial"><P>Hi all!</P>
<P>&nbsp;</P>
<P>We have been testing OpenAFS 1.4.0 on AIX 5.3. When our two AFS servers were installed, one of them took a system dump while the disk cache was being set up. Since this did not happen on the other server, and did not recur after we removed and reinstalled AFS, we did not look any deeper into it.</P>
<P>&nbsp;</P>
<P>We then had the systems running for a few days without much testing, when we suddenly needed the servers urgently to test a SAP application setup. We did not expect any problem running SAP while AFS was still running on the servers: the servers are big enough, and SAP was installed on its own filesystems and used nothing that had to do with AFS (not the AFS filesystem, no authentication through AFS, etc.). But every time we tried to start SAP the system dumped, and AIX 'errpt' showed the same problem that had been reported during the first crash at installation. The message reported every time was:</P>
<P>&nbsp;</P>
<P>---------------------------------------------------------------------------<BR>LABEL:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DSI_PROC<BR>IDENTIFIER:&nbsp;&nbsp;&nbsp;&nbsp; 9D035E4D</P>
<P>Date/Time:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Mon&nbsp;Mar&nbsp;20 16:22:30 2006<BR>Sequence Number: 77<BR>Machine Id:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; XXXXXXXXX<BR>Node Id:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; xxxxxxxx<BR>Class:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; S<BR>Type:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; PERM<BR>Resource Name:&nbsp;&nbsp; SYSVMM</P>
<P>Description<BR>DATA STORAGE INTERRUPT, PROCESSOR</P>
<P>Probable Causes<BR>SOFTWARE PROGRAM</P>
<P>Failure Causes<BR>SOFTWARE PROGRAM</P>
<P>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Recommended Actions<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; IF PROBLEM PERSISTS THEN DO THE FOLLOWING<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CONTACT APPROPRIATE SERVICE REPRESENTATIVE</P>
<P>Detail Data<BR>DATA STORAGE INTERRUPT STATUS REGISTER<BR>0000 0000 0000 0000<BR>SEGMENT REGISTER, SEGREG<BR>0A00 0000 0000 0000<BR>DATA STORAGE INTERRUPT ADDRESS REGISTER<BR>0000 0400 0000 0000<BR>EXVAL<BR>0000 0004 0000 0000<BR>---------------------------------------------------------------------------<BR>&nbsp;</P>
<P>&nbsp;</P>
<P>I searched the OpenAFS mailing list archives for anything resembling this and found only Niklas Edmundsson's mail from September 2005, which I assume predates the official 1.4.0 release, and nothing after that.</P>
<P>&nbsp;</P>
<P>After we stopped AFS on the servers we had no problem starting SAP, so it seems that something during the SAP startup procedure upsets AFS and causes the whole system to crash.</P>
<P>&nbsp;</P>
<P>Has anyone out there running 1.4.0 on AIX 5.3 experienced anything like this? Admittedly, running SAP systems on an AFS server can't be considered a normal setup, but the fact that an application can cause AFS to crash the system at all makes me, at least, wonder...</P>
<P>&nbsp;</P>
<P>Regards, Jonas Andersen SCA</P></br></div></body></html>