[OpenAFS-devel] Problems on 8-way Itanium2 system

Alf Wachsmann alfw@slac.stanford.edu
Fri, 28 Jan 2005 15:45:51 -0800 (PST)


Hi,

we are seeing AFS hangs on our 72-CPU SGI Altix system, which runs
a 2.4.21-sgi302r24 kernel with OpenAFS-1.2.10.

We are able to reproduce the hangs on an 8-CPU configuration.
A backtrace shows something like this:

Stack traceback for pid 1868
0xe0000030f7ea8000     1868        1  1    6   R  0xe0000030f7ea85a0 *afs_background
0xe000000004408c20 ia64_spinlock_contention+0x20
        args (0xe00008b4f5cfc780, 0x1, 0xe00008b4f5cfc9f4, 0xe00008b4f5cfc9f0, 0xe0000030f7eafc40)
        kernel .text 0xe000000004400000 0xe000000004408c00 0xe000000004408c50
        r31 (spinlock address) 0xe000000004ea4380 kernel_flag
0xe0000000045f03b0 ext2_new_block+0x11d0
        kernel .text 0xe000000004400000 0xe0000000045ef1e0 0xe0000000045f0d80
0xe00000000515b200 sn_latency_matrix+0x5d570
        args (0xe00005b4f2ece000, 0x1000, 0xe0000030f4496408, 0x534f2370000, 0xa0007fed45f752b0)
        kernel .bss 0xe000000005067500 0xe0000000050fdc90 0xe0000000058fdc90
0xe0000000045f0470 ext2_new_block+0x1290
        args (0xa00000003abaca08, 0xa00000003abc0ab0, 0x80000, 0xa00000003b0ba3f8, 0xa00000003abc0d30)
        kernel .text 0xe000000004400000 0xe0000000045ef1e0 0xe0000000045f0d80

Has anyone else seen this problem? Is there a fix or workaround?

Many thanks,
                Alf.

-----------------------------------------------------------------------
  Alf Wachsmann                       | e-mail: alfw@slac.stanford.edu
  SLAC Computing Service              | Phone:  +1-650-926-4802
  2575 Sand Hill Road, M/S 97         | FAX:    +1-650-926-3329
  Menlo Park, CA 94025, USA           | Office: Bldg. 50/323
-----------------------------------------------------------------------
                http://www.slac.stanford.edu/~alfw (PGP)
-----------------------------------------------------------------------