[OpenAFS] Re: Rx call failed during dump, error -01

Adam Megacz megacz@cs.berkeley.edu
Fri, 31 Aug 2007 20:40:14 -0700


Hrm, further weirdness.  This time there's no Rx error, but "vos dump"
simply sits there at 0% CPU utilization.  Worse, something appears to
be unhappy in the AFS client.  Here is what happened when I ran aklog
twice in a row, followed by the gdb backtrace from the second run:

  $ aklog -c megacz.com
  aklog: Couldn't get megacz.com AFS tickets:
  aklog: Credentials cache I/O operation failed XXX while getting AFS tickets
  $ aklog -c megacz.com
  *** glibc detected *** double free or corruption (!prev): 0x0809e1e0 ***
  Aborted

  (gdb) bt
  #0  0xb7d9c947 in raise () from /lib/tls/libc.so.6
  #1  0xb7d9e0c9 in abort () from /lib/tls/libc.so.6
  #2  0xb7dd216a in __fsetlocking () from /lib/tls/libc.so.6
  #3  0xb7dd9a2f in mallopt () from /lib/tls/libc.so.6
  #4  0xb7dd9ad2 in free () from /lib/tls/libc.so.6
  #5  0xb7f303db in krb5_free_cred_contents () from /usr/lib/libkrb5.so.3
  #6  0xb7f14127 in krb5_get_notification_message () from /usr/lib/libkrb5.so.3
  #7  0xb7f1c507 in krb5_cc_next_cred () from /usr/lib/libkrb5.so.3
  #8  0xb7f10566 in krb5int_cc_creds_match_request () from /usr/lib/libkrb5.so.3
  #9  0xb7f107ca in krb5_cc_retrieve_cred_default () from /usr/lib/libkrb5.so.3
  #10 0xb7f16089 in krb5_get_notification_message () from /usr/lib/libkrb5.so.3
  #11 0xb7f1c47e in krb5_cc_retrieve_cred () from /usr/lib/libkrb5.so.3
  #12 0xb7f2aed6 in krb5_get_credentials () from /usr/lib/libkrb5.so.3
  
I should note that we are running "vos dump" on the fileserver
machine itself, and the dump output is being written into a directory
under /afs/ that belongs to a different cell.  Bizarre, yes, but this
should work, right?

I've sent the tcpdump capture to Derrick.

  - a




Derrick J Brashear <shadow@dementia.org> writes:
> On Wed, 29 Aug 2007, Adam Megacz wrote:
>
>>
>> Derrick J Brashear <shadow@dementia.org> writes:
>>>> We're getting this consistently when attempting to dump a particular
>>>> volume (~750mb size).  Has anybody seen this before?
>>
>>> dump, or move?
>>
>> vos dump.
>>
>>>> Wed Aug 29 02:20:30 2007 trans 2158 on volume 536879780 is older than 690 seconds
>>>> Wed Aug 29 02:20:35 2007 1 Volser: DumpVolume: Rx call failed during dump, error -01
>>
>>> tcpdump?
>>
>> What flags would you like?
>
> -x -s 1500 port 7005, probably. i just want to see the error code in
> the abort.
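
For anyone following along: the error code in the abort can also be
pulled out of a capture programmatically.  The following is a minimal
Python sketch, not anything taken from OpenAFS itself.  It assumes the
Rx wire header is 28 bytes long, that the packet-type byte sits at
offset 20, that type 4 marks an abort packet, and that the abort code
is the first 32-bit signed big-endian word after the header.  The
function name rx_abort_code and the constants are hypothetical, and
the input is expected to be the raw UDP payload of one packet (for
example, bytes recovered from the "-x" hex output).

```python
import struct

# All three constants are assumptions about the Rx wire format.
RX_HEADER_LEN = 28        # assumed size of the Rx header in bytes
RX_TYPE_OFFSET = 20       # assumed offset of the packet-type byte
RX_PACKET_TYPE_ABORT = 4  # assumed packet-type value for an abort


def rx_abort_code(udp_payload):
    """Return the signed abort code of an Rx abort packet.

    Returns None if the payload is too short or is not an abort packet.
    """
    if len(udp_payload) < RX_HEADER_LEN + 4:
        return None
    if udp_payload[RX_TYPE_OFFSET] != RX_PACKET_TYPE_ABORT:
        return None
    # The code is read as a signed 32-bit integer in network byte order.
    (code,) = struct.unpack_from("!i", udp_payload, RX_HEADER_LEN)
    return code


if __name__ == "__main__":
    # Synthetic packet: zeroed header fields, type byte set to "abort",
    # followed by an abort code of -1.
    header = bytes(20) + bytes([RX_PACKET_TYPE_ABORT, 0, 0, 0]) + bytes(4)
    packet = header + struct.pack("!i", -1)
    print(rx_abort_code(packet))
```

The synthetic packet carries -1 only because that is the value the
volserver log reported ("error -01"); a real capture may show
something different.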

-- 
PGP/GPG: 5C9F F366 C9CF 2145 E770  B1B8 EFB1 462D A146 C380