[OpenAFS] crash on AIX 5.2

Hartmut Reuter reuter@rzg.mpg.de
Tue, 11 Jan 2005 17:45:33 +0100


Jeffrey Altman wrote:
> Hartmut Reuter wrote:
> 
>>
>> I am in the process of tracking down all differences between my good 
>> version and 1.3.77.
>>
>> I am now quite close to 1.3.77, and at least one problem seems
>> to be the new code in afs_pioctl.c for get and set tokens along with
>> the huge ticket size introduced for compatibility with Active Directory.
>> Keeping the old ticket size and the old code for tokens in afs_pioctl.c
>> results in a fairly stable client. At least I can get a token, make 
>> clean in the openafs-tree and make dest without crashing the system.
>> This is certainly not enough testing for putting it into production,
>> but a hint where the problem may be hidden.
>>
>> Hartmut
> 
> 
> We know the problem is in the set/get token code on AIX.  More than
> likely the stack is too small to support a 12000 byte object and it
> is getting blown away on AIX.  The question is:
> 
>   * where is this object that is located on the stack?
> 
> If you can find that, then you will have solved the bug.

This does not look like a stack overflow. The crash always happens in xmalloc1:

(0)> f
pvthread+00A500 STACK:
[006021F0]xmalloc1+0007AC (0000000000000200, F10000E00C22E000,
    0000000000000000, F10000E00C22E000, 0000000000000400, F10000E03B964269,
    0000000000000002, 00000000003E4338 [??])
[00606B70]xmalloc+000208 (??, ??, ??)
[08E41978]afs_osi_Alloc+00005C (??)
[08EBC6DC]afs_HandlePioctl+0003D4 (0000000000000000, 800C5608800C5608,
    F00000002FF3A400, 0000000000000000, F00000002FF3A438)
[08EC74F8]afs_syscall_pioctl+000294 (0000000000000000, 800C5608800C5608,
    000000002FF21FC0, 0000000000000000)
[08E46000]syscall+0001A0 (0000001400000014, 0000000000000000,
    800C5608800C5608, 2FF21FC02FF21FC0, 0000000000000000, 2E6D70672E6D7067,
    0000008000000080)
[08E45DB8]lpioctl+000050 (0000000000000000, 800C5608800C5608,
    000000002FF21FC0, 0000000000000000)
[0000379C]sc_msr_2_point+000028 ()
Not a valid dump data area @ 2FF21CF0
(0)>

So storage on the kernel heap was probably overwritten.
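
To illustrate the kind of failure suspected here, the following is a minimal user-space sketch in C, not OpenAFS or AIX kernel code. It assumes a heap buffer sized for a small ticket and a caller that hands in a larger one. All names (guarded_alloc, guard_intact, copy_ticket) and the sizes used with them are hypothetical. A guard zone behind the usable bytes makes an overrun visible at the point where it happens, instead of as a later crash inside the allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of OpenAFS or AIX. */
#define GUARD_BYTES 8
#define GUARD_FILL  0xA5

/* Allocate `size` usable bytes followed by a guard zone with a known pattern. */
static void *guarded_alloc(size_t size)
{
    assert(size <= SIZE_MAX - GUARD_BYTES);   /* avoid size_t wrap-around */
    unsigned char *p = malloc(size + GUARD_BYTES);
    if (p == NULL)
        return NULL;
    memset(p + size, GUARD_FILL, GUARD_BYTES);
    return p;
}

/* Return 1 if the guard zone behind `size` usable bytes is intact, else 0. */
static int guard_intact(const void *buf, size_t size)
{
    const unsigned char *g = (const unsigned char *)buf + size;
    for (size_t i = 0; i < GUARD_BYTES; i++)
        if (g[i] != GUARD_FILL)
            return 0;
    return 1;
}

/* Copy a ticket only if it fits; return 0 on success, -1 if it is too large. */
static int copy_ticket(void *dst, size_t dst_size,
                       const void *src, size_t src_len)
{
    if (src_len > dst_size)
        return -1;
    memcpy(dst, src, src_len);
    return 0;
}
```

With a destination of, say, 512 usable bytes and a 12000 byte ticket, copy_ticket refuses the copy and the guard stays intact. An unchecked memcpy that runs even a few bytes past the usable size changes the guard pattern, which guard_intact reports immediately.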

Hartmut

> 
> Jeffrey Altman
> 
> 


-- 
-----------------------------------------------------------------
Hartmut Reuter                           e-mail reuter@rzg.mpg.de
                                           phone +49-89-3299-1328
RZG (Rechenzentrum Garching)               fax   +49-89-3299-1301
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------