[OpenAFS-devel] Kernel hangs on RH Linux systems

Alf Wachsmann alfw@SLAC.Stanford.EDU
Thu, 13 Jun 2002 15:46:40 -0700 (PDT)


We have some heavily used Red Hat 7.2 Linux dual P3 machines running
OpenAFS. These machines freeze on a daily basis in some sort of kernel
deadlock. We tried Red Hat kernels 2.4.9-13smp, 2.4.18-0.4smp,
2.4.18-3smp, 2.4.18-4smp (yeah, we are desperate) and OpenAFS versions
1.2.3 and 1.2.4, but the machines freeze no matter what.

The only thing that still works in this frozen state is the Magic SysRq
key, with which we get the task list and a traceback for each task.

A typical traceback from a hanging job looks like the one below, taken
from a cron job that runs a Perl script. Everything for this job (the
script, the Perl binary, all Perl modules) resides in AFS.

15:06:33 cronjob       D C02F039C     0  5832   5831                     (NOTLB)
15:06:33 Call Trace: [<f8a30a60>] afs_global_lock [libafs-2.4.18-4smp.mp] 0x0
15:06:33 [<f8a30a60>] afs_global_lock [libafs-2.4.18-4smp.mp] 0x0
15:06:33 [<c0107849>] __down [kernel] 0x69
15:06:33 [<c01079f8>] __down_failed [kernel] 0x8
15:06:33 [<f8a30a60>] afs_global_lock [libafs-2.4.18-4smp.mp] 0x0
15:06:33 [<f8a09b69>] afs_symlink_filler [libafs-2.4.18-4smp.mp] 0x529
15:06:33 [<c014c715>] permission [kernel] 0x45
15:06:33 [<c014d4bb>] link_path_walk [kernel] 0x9db
15:06:33 [<c014c56e>] getname [kernel] 0x5e
15:06:33 [<c014dae3>] __user_walk [kernel] 0x33
15:06:33 [<c01499f7>] vfs_stat [kernel] 0x17
15:06:33 [<c0130950>] file_read_actor [kernel] 0x0
15:06:33 [<c0149fa1>] sys_stat64 [kernel] 0x11
15:06:33 [<c0141b4f>] sys_read [kernel] 0x10f
15:06:33 [<c0108c6b>] system_call [kernel] 0x33


I can gladly provide the same kind of traceback for all "afsd" processes.
Unfortunately, this is all the information we have about this problem.
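
For sorting through a full task dump, something along the following
lines might help. It is only a minimal sketch that assumes the log format
shown in the trace above (a timestamp prefix, a task header line, then
frames of the form "[<addr>] symbol [module] offset"). The names
parse_tasks and blocked_on are made up for illustration and are not part
of OpenAFS or the kernel. It groups frames by task and lists the tasks in
state D whose stack mentions afs_global_lock.

```python
import re

# Task header line, assumed to look like the one in the trace above:
# "15:06:33 cronjob       D C02F039C     0  5832   5831   (NOTLB)"
TASK_RE = re.compile(
    r"^\S+\s+(?P<name>\S+)\s+(?P<state>[A-Z])\s+[0-9A-Fa-f]{8}\s+\d+\s+(?P<pid>\d+)\s"
)
# One stack frame, e.g. "[<c0107849>] __down [kernel] 0x69"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<sym>\S+)\s+\[(?P<mod>[^\]]+)\]")


def parse_tasks(lines):
    """Group stack frames under the task header line that precedes them."""
    tasks = []
    current = None
    for line in lines:
        header = TASK_RE.match(line)
        if header:
            current = {
                "name": header.group("name"),
                "state": header.group("state"),
                "pid": int(header.group("pid")),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append((frame.group("sym"), frame.group("mod")))
    return tasks


def blocked_on(tasks, symbol="afs_global_lock"):
    """Return tasks in uninterruptible sleep (D) whose stack mentions symbol."""
    return [
        t for t in tasks
        if t["state"] == "D" and any(sym == symbol for sym, _ in t["frames"])
    ]
```

Feeding the logged lines to parse_tasks() and then calling blocked_on()
on the result gives the list of tasks waiting on the lock, which makes it
easier to spot the one task that is not waiting and may be holding it.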

Has anyone a clue what might be going on here?

Thanks,
             Alf.

-----------------------------------------------------------------------
  Alf Wachsmann                       | e-mail: alfw@slac.stanford.edu
  SLAC Computing Service              | Phone:  +1-650-926-4802
  2575 Sand Hill Road, M/S 97         | FAX:    +1-650-926-3329
  Menlo Park, CA 94025, USA           | Office: Bldg. 50/323
-----------------------------------------------------------------------
                http://www.slac.stanford.edu/~alfw (PGP)
-----------------------------------------------------------------------