[OpenAFS-devel] Crashes on AIX6 shutdown

Niklas Edmundsson Niklas.Edmundsson@hpc2n.umu.se
Mon, 23 Mar 2009 14:04:41 +0100 (MET)


On Mon, 23 Mar 2009, Niklas Edmundsson wrote:

>
> Hi all!
>
> I finally managed to get some clue on where OpenAFS 1.4.8 sometimes seems to 
> crash upon shutdown of my AIX6 boxes.
>
> The dump is slightly botched, but at least it gives a hint:
>
> ---------------------8<----------------------
> Address at fault was 0xF100060023FD8570
>
> CRASH INFORMATION:
> CPU 0 CSA F00000002FF47600 at time of crash, error code for LEDs: 70000000
> pvthread+00F300 STACK:
> [00019D40]abend_trap+000000 ()
> [0047FB18]xmdbg_do_xmfree_record+000518 (F100060023FD8570, F100060000000108,
>   F100060000008B20, 000013FD000013FD, F00000002FF46F88)
> [0047CD84]xmfree1+000264 (F100060023FD8570, F100060000000108,
>   0000000013FD8570, F100060000000380, 0000000000000000, 0000000B00000003,
>   0000000000000000, 0000000000000000)
> [0047DD70]xmfree+0001F0 (??, ??)
> [F10000009073D12C]afs_osi_Free+00004C (??, ??)
> [F100000090782FD4]shutdown_vcache+0000F4 ()
> [F1000000907DA440]shutdown_cache+000060 ()
> [F100000090756E94]afs_shutdown+000294 ()
> [F100000090778B20]afs_unmount+0000A0 (F100060016290D08, 0000000000000000)
> [F10000009076F944]vfs_unmount+0000A4 (F100060016290D08, 0000000000000000,
>   F1000600233A287C)
> [00014D70].hkey_legacy_gate+00004C ()
> [0054CC34]vfs_unmount+000094 (??, ??, ??)
> [004C3844]kunmount+0000A4 (??, ??, ??)
> [004C4200]uvmount+000200 (??, ??)
> [00003814].svc_instr+000114 ()
> ---------------------8<----------------------
>
> Any ideas? A classic double free, or something more serious?
>
> I know that AIX6 has some new kernel memory protection scheme they call 
> "Storage-Protection Keys" that compartmentalises memory accesses and thus 
> will be more likely to expose bugs. I have no clue on whether it provides 
> some protection by default, or if this support has to be added/enabled for it 
> to work.
>
> http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.kernelext/doc/kernextc/kernelkey.htm 
> seems to be the relevant docco-page.
>
> According to the docco-page support must be added, but I strongly suspect 
> that the support for the legacy way of doing stuff has some form of 
> compartment set up...

See the "Key-unsafe kernel extension support" section in this page for 
some more info on this:

http://www.ibm.com/developerworks/aix/library/au-keykernext/index.html

Executive summary: by default there is no additional protection for
legacy code. Glue code is transparently inserted to achieve this
(hence the magic hkey_legacy_gate in the trace).
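
To make the "classic double free" question from my earlier mail concrete,
here is a minimal user-space sketch. It is an illustration only, not
OpenAFS or AIX kernel code: struct tracked_buf and tracked_free() are
hypothetical names made up for this example. It shows the general idea of
flagging a second release of the same buffer instead of handing it to the
allocator again. Whether anything like this actually happens in
shutdown_vcache is still an open question.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a tracked allocation (not an OpenAFS type). */
struct tracked_buf {
    void *ptr;    /* the allocation itself */
    int   freed;  /* set to 1 once the buffer has been released */
};

/*
 * Release the buffer exactly once.
 * Returns 0 on a normal free, -1 if the buffer was already freed
 * (or was never allocated), in which case free() is not called again.
 */
static int tracked_free(struct tracked_buf *b)
{
    if (b->freed || b->ptr == NULL) {
        return -1;          /* would have been a double free */
    }
    free(b->ptr);
    b->ptr = NULL;          /* avoid leaving a dangling pointer behind */
    b->freed = 1;
    return 0;
}
```

Calling tracked_free() twice on the same struct returns 0 the first time
and -1 the second time, rather than passing a stale pointer to free().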

/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se     |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
  Coffee - 2 sugars - cream - and aspirin.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=