[OpenAFS-devel] 1.3.79 on AIX 5.2, system dump when using token

Hartmut Reuter reuter@rzg.mpg.de
Fri, 25 Feb 2005 15:05:46 +0100


Michael Niksch wrote:
>> Michael: You might want to try my patch and see how much things 
>> improve for you.
> 
> 
> I'll try that.
> 
>> Given that the thing works better with a larger stack I'd assume that 
>> some assumption that's valid on AIX4 isn't valid now, which causes the 
>> thing to overwrite memory in kernel space, and things go downhill 
>> from there.
> 
> 
> I compiled 1.3.79 also on AIX 4.3.3, where I have been using 1.2.10 
> without problems for a long time now. With 1.3.79, however, I continue 
> to have the problem that klog dies with 'Illegal instruction' as it did 
> already with previous versions of 1.3.xx. If I use klog from 1.2.10, I 
> can obtain a token, but after doing so, the machine dies just like the 
> AIX 5.2 machine, even though kdb stat (attached) looks different.
> 

There are probably two problems:

1) klog does not work unless you compile it without -O (I think this 
matters especially for the des libraries, but it is better to do it for 
everything). This is obviously a bug in the IBM compiler.

2) afs_pioctl.c was changed at some point between 1.2 and 1.3 to 
support the giant tickets created by Microsoft's Active Directory (12000 
bytes). AIX has a problem with this. Therefore I went back to the old 
code, which limits the ticket size to a reasonable value.
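
To illustrate what limiting the ticket size in point 2 means in practice, 
here is a minimal sketch in C. It is not the code from afs_pioctl.c: the 
function name sketch_copy_ticket and the constant SKETCH_MAX_TICKET are 
hypothetical, and the value 1024 is only an assumed example of a 
"reasonable" limit. The idea is simply to reject a ticket that is larger 
than the limit or the destination buffer before copying it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for illustration only; not the actual OpenAFS value. */
#define SKETCH_MAX_TICKET 1024

/*
 * Copy a ticket into a fixed-size buffer, refusing oversized input.
 * Returns 0 on success, -1 if an argument is NULL or the ticket is
 * larger than either the assumed limit or the destination buffer.
 */
int sketch_copy_ticket(char *dst, size_t dst_size,
                       const char *ticket, size_t ticket_len)
{
    if (dst == NULL || ticket == NULL)
        return -1;

    /* Check the length before touching the destination buffer. */
    if (ticket_len > SKETCH_MAX_TICKET || ticket_len > dst_size)
        return -1;

    memcpy(dst, ticket, ticket_len);
    return 0;
}
```

With a check of this kind, a 12000-byte ticket would be refused up front 
instead of being copied into a buffer sized for smaller tickets.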

My version in /afs/ipp-garching.mpg.de/openmrafs/openafs is based mostly 
on 1.3.77 and works correctly on AIX 5.2. Feel free to copy.

Hartmut Reuter


> 
> ------------------------------------------------------------------------
> 
> vmcore.24 mapped from @ 70000000 to @ 72f3e200
> Preserving 582420 bytes of symbol table
> First symbol __mulh
> KERNEXT FUNCTION NAME CACHE (90112 bytes) allocated
> KERNEXT COMMANDS SPACE (4096 bytes) allocated
> Component Names:
>  1)  proc [126 entries]
>  2)  thrd [216 entries]
>  3)  ldr [1 entries]
>  4)  errlg [3 entries]
>  5)  bos [7 entries]
>  6)  ipc [7 entries]
>  7)  vmm [19 entries]
>  8)  sscsidd [1 entries]
>  9)  scdisk [5 entries]
> 10)  lvm [2 entries]
> 11)  tty [4 entries]
> 12)  netstat [10 entries]
> 13)  phxent_dd [5 entries]
> 14)  bldd [5 entries]
> Component Dump Table has 411 entries
>            START              END <name>
> 0000000000003500 00000000010648E0 _system_configuration+000020
> 000000002FF3B400 000000002FF7E428 __ublock+000000
> 000000002FF22FF4 000000002FF22FF8 environ+000000
> 000000002FF22FF8 000000002FF22FFC errno+000000
> 00000000E0000000 00000000F0000000 ameseg+10000000
> PFT:
> id....................0007
> raddr.....0000000001000000 eaddr.....0000000001000000
> size..............00400000 align.............00400000
> valid..1 ros....0 holes..0 io.....0 seg....1 wimg...2
> 
> PVT:
> id....................0008
> raddr.....0000000000228000 eaddr.....0000000000228000
> size..............00080000 align.............00001000
> valid..1 ros....0 holes..0 io.....0 seg....1 wimg...2
> Dump analysis on CHRP_UP_PCI POWER_PC POWER_604 machine with 1 cpu(s)
> Processing symbol table...
> .......................done
> (0)> stat
> CHRP_UP_PCI POWER_PC POWER_604 machine with 1 cpu(s)
> .......... SYSTEM STATUS
> sysname... AIX        nodename.. finsteraar
> release... 3          version... 4         
> machine... 0048C8CA4C nid....... 48C8CA4C
> time of crash: Fri Feb 25 14:07:44 2005
> age of system: 18 min., 32 sec.
> .......... CPU 0 CSA 2FF3B400 at time of crash, error code for LEDs: 30000000
> thread+004900 STACK:
> [050CD1FC]rxi_NewCall+0000BC (30C92020, 00000000)
> [050CED18]rx_NewCall+0001C0 (30C92020)
> [05112234]RXAFS_FetchStatus+00002C (30C92020, 30D5FE8C, 2FF3B040, 2FF3AF78,
>    2FF3AF90)
> [0510B6AC]afs_FetchStatus+0000F4 (??, ??, ??, ??)
> [05129F0C]afs_GetAccessBits+0001B0 (30D5FE48, 00000008, 2FF3B188)
> [05129B64]afs_AccessOK+00006C (30D5FE48, 00000008, 2FF3B188, 00000001)
> [051298E0]afs_access+0002C8 (30D5FE48, 00000001, 30002400)
> [0513722C]afs_gn_access+000060 (30D5FE48, 00000001, 00000000, 30002400)
> [05131C50]vn_access+00008C (30D5FE48, 00000001, 00000000, 30002400)
> [001684D8]vnop_access+000018 (??, ??, ??, ??)
> [00160270]chdirec+000068 (??, ??)
> [00160108]chdir+000120 (??)
> [00003A48].sys_call+000000 ()
> page not in dump @ 1001EEF8
> (0)> q


-- 
-----------------------------------------------------------------
Hartmut Reuter                           e-mail reuter@rzg.mpg.de
					   phone +49-89-3299-1328
RZG (Rechenzentrum Garching)               fax   +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------