[OpenAFS-devel] Crashes on AIX6 shutdown

Niklas Edmundsson Niklas.Edmundsson@hpc2n.umu.se
Mon, 23 Mar 2009 11:41:33 +0100 (MET)


Hi all!

I finally got a clue about where OpenAFS 1.4.8 sometimes crashes 
during shutdown of my AIX6 boxes.

The dump is slightly botched, but at least it gives a hint:

---------------------8<----------------------
Address at fault was 0xF100060023FD8570

CRASH INFORMATION:
CPU 0 CSA F00000002FF47600 at time of crash, error code for LEDs: 70000000
pvthread+00F300 STACK:
[00019D40]abend_trap+000000 ()
[0047FB18]xmdbg_do_xmfree_record+000518 (F100060023FD8570, F100060000000108,
    F100060000008B20, 000013FD000013FD, F00000002FF46F88)
[0047CD84]xmfree1+000264 (F100060023FD8570, F100060000000108,
    0000000013FD8570, F100060000000380, 0000000000000000, 0000000B00000003,
    0000000000000000, 0000000000000000)
[0047DD70]xmfree+0001F0 (??, ??)
[F10000009073D12C]afs_osi_Free+00004C (??, ??)
[F100000090782FD4]shutdown_vcache+0000F4 ()
[F1000000907DA440]shutdown_cache+000060 ()
[F100000090756E94]afs_shutdown+000294 ()
[F100000090778B20]afs_unmount+0000A0 (F100060016290D08, 0000000000000000)
[F10000009076F944]vfs_unmount+0000A4 (F100060016290D08, 0000000000000000,
    F1000600233A287C)
[00014D70].hkey_legacy_gate+00004C ()
[0054CC34]vfs_unmount+000094 (??, ??, ??)
[004C3844]kunmount+0000A4 (??, ??, ??)
[004C4200]uvmount+000200 (??, ??)
[00003814].svc_instr+000114 ()
---------------------8<----------------------

Any ideas? A classic double free, or something more serious?
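
To make the double-free suspicion concrete, here is a minimal user-space 
sketch of how a shutdown path can end up freeing the same block twice, and 
the usual guard against it. This is not OpenAFS code: demo_cache, 
demo_cache_init and demo_cache_shutdown are hypothetical names, and plain 
malloc/free stand in for the kernel allocator seen in the trace. Whether 
shutdown_vcache actually hits this pattern is an assumption, not something 
the dump proves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cache table released at shutdown. */
struct demo_cache {
    void *table;     /* heap block owned by the cache */
    int free_calls;  /* counts real frees, for demonstration only */
};

/* Allocate the table; returns 0 on success, -1 on allocation failure. */
static int demo_cache_init(struct demo_cache *c, size_t bytes)
{
    assert(c != NULL);
    c->table = malloc(bytes);
    c->free_calls = 0;
    return c->table != NULL ? 0 : -1;
}

/*
 * Idempotent shutdown: free the block once, then clear the pointer so
 * that a second call (for example from a repeated unmount path) becomes
 * a no-op instead of handing a stale address back to the allocator.
 */
static void demo_cache_shutdown(struct demo_cache *c)
{
    assert(c != NULL);
    if (c->table == NULL)
        return;          /* already released, nothing to do */
    free(c->table);
    c->table = NULL;     /* without this line, a second call double-frees */
    c->free_calls++;
}
```

Calling demo_cache_shutdown twice on the same structure performs exactly 
one real free. Without the NULL reset, the second call would pass an 
already-freed address to the allocator, which is the kind of error a 
debugging allocator is designed to trap.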

I know that AIX6 has a new kernel memory protection scheme called 
"Storage-Protection Keys" that compartmentalises memory accesses 
and is thus more likely to expose bugs. I don't know whether it 
provides some protection by default, or whether support has to be 
added/enabled in the kernel extension for it to work.

http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.kernelext/doc/kernextc/kernelkey.htm 
seems to be the relevant docco-page.

According to the docco-page, support must be added explicitly, but I 
strongly suspect that kernel extensions doing things the legacy way 
still get some form of compartment set up...

Personally I think that this would be a good time for IBM to chime in 
and provide some patches ;)


/Nikke - rambling in a random direction, means lunch is needed...
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se     |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
  KARAOKE: A Japanese word meaning tone deaf.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=