[OpenAFS] afs and tru64 - Hanging processes

Padiyath Sreekumaran Kumar.Padiyath@psi.ch
Mon, 1 Dec 2003 12:14:13 +0100


  Hello,
   I have installed the IBM version (3.6) of the AFS client on our Tru64
   machines (Tru64 OS 5.1A). The disk cache size is 130 MB:
   cat /afscache/etc/cacheinfo 
   /afs:/usr/vice/cache:130000
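
   For anyone checking the numbers: the cacheinfo line has three
   colon-separated fields (AFS mount point, cache directory, cache size in
   1-kilobyte blocks). A minimal sketch for reading it follows; the function
   name parse_cacheinfo is only illustrative and is not part of AFS.

```python
def parse_cacheinfo(line):
    """Split one cacheinfo line into (mount point, cache dir, size in 1K blocks)."""
    mount_point, cache_dir, blocks = line.strip().split(":")
    return mount_point, cache_dir, int(blocks)


# The line from this machine's cacheinfo file.
mount_point, cache_dir, blocks = parse_cacheinfo("/afs:/usr/vice/cache:130000")

# Convert 1-kilobyte blocks to mebibytes for a quick sanity check.
size_mib = blocks / 1024
print(mount_point, cache_dir, blocks, round(size_mib, 1))
```

   So 130000 blocks is a little under 127 MiB, i.e. roughly the 130 MB
   quoted above.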

   I am seeing a number of processes in uninterruptible sleep. When I
   debug these processes, I always end up in AFS routines.
  
  Example:
  =======

root@psw283:/>dbx -k /vmunix
dbx version 5.1
Type 'help' for help.

stopped at  [thread_block:3230 ,0xfffffc00002ea270]      Source not
available

warning: Files compiled -g3: parameter values probably wrong
(dbx) set $pid=49211
(dbx) t
>  0 thread_block() ["../../../../src/kernel/kern/sched_prim.c":3230,
0xfffffc00002ea270]
   1 ubc_invalidate_lookup(0x0, 0xfffffc00a1014000, 0x1, 0x10040, 0x0)
["../../../../src/kernel/vfs/vfs_ubc.c":4948, 0xfffffc000067d92c]
   2 ubc_invalidate(0x500000010, 0x1, 0xfffffc00d5e30170,
0xfffffc00005778f0, 0xfffffc0092ca412c)
["../../../../src/kernel/vfs/vfs_ubc.c":5085, 0xfffffc000067db88]
   3 afs_ustrategy(abp = (unallocated - symbol optimized away), credp =
(unallocated - symbol optimized away)) ["../afs/afs_vnop_strategy.c":167,
0xfffffc00005a27a0]
   4 mp_afs_putpage(vop = (unallocated - symbol optimized away), pl =
(unallocated - symbol optimized away), pcnt = (unallocated - symbol
optimized away), flags = (unallocated - symbol optimized away), cred =
(unallocated - symbol optimized away)) ["../afs/osi_vnodeops.c":936,
0xfffffc00005d4e88]
   5 ubc_flush_dirty_age(0xfffffc00c95685e4, 0xfffffc00c95685a0,
0xfffffc00d5e30170, 0xfffffc00d5e30174, 0x1)
["../../../../src/kernel/vfs/vfs_ubc.c":5503, 0xfffffc000067edd8]
   6 osi_ubc_flush_dirty_and_wait(vp = (unallocated - symbol optimized
away), flags = (unallocated - symbol optimized away)) ["../afs/osi_vm.c":90,
0xfffffc00005d1890]
   7 osi_VM_StoreAllSegments(avc = (unallocated - symbol optimized away))
["../afs/osi_vm.c":130, 0xfffffc00005d19d4]
   8 afs_StoreAllSegments(avc = (unallocated - symbol optimized away), areq
= (unallocated - symbol optimized away), sync = (unallocated - symbol
optimized away)) ["../afs/afs_segments.c":204, 0xfffffc0000585b80]
   9 mp_afs_ubcrdwr(avc = (unallocated - symbol optimized away), uio =
(unallocated - symbol optimized away), ioflag = (unallocated - symbol
optimized away), cred = (unallocated - symbol optimized away))
["../afs/osi_vnodeops.c":657, 0xfffffc00005d3d88]
  10 vn_write(0xfffffc00002b5f00, 0xfffffe054528f878, 0xfffffc004cc083c0,
0x1, 0x19) ["../../../../src/kernel/vfs/vfs_vnops.c":1474,
0xfffffc0000685680]
  11 rwuio(0xfffffe0545288000, 0xfffffc00a1014000, 0xfffffc00ef8219c0,
0xfffffe054528f8f0, 0x1) ["../../../../src/kernel/bsd/sys_generic.c":2264,
0xfffffc00002b5f54]
More (n if no)?
  12 write(0x6cd9b, 0xfffffc0000000001, 0x0, 0x100000000,
0xffffffff00000002) ["../../../../src/kernel/bsd/sys_generic.c":2186,
0xfffffc00002b5de8]
  13 syscall(0x19, 0x0, 0x0, 0x1200bb214, 0x0)
["../../../../src/kernel/arch/alpha/syscall_trap.c":725, 0xfffffc00006e5f80]
  14 _Xsyscall(0x8, 0x3ff800d2828, 0x3ffc125cd40, 0x1, 0x1523eb000)
["../../../../src/kernel/arch/alpha/locore.s":1814, 0xfffffc00006e975c]



      I have users running jobs for more than 24 hours. The default lifetime
of an AFS token is 24 hours.
      I asked HP about this hanging problem. According to them, the problem
comes from AFS.
      Users are having trouble logging in: they never get a prompt back.
      I have to reboot the system to get rid of these uninterruptible
processes. After a reboot, users can log in again. Is it possible to clear
the AFS cache without a reboot?
      Has anyone else noticed this problem? Does it have something to do
with the cache size?

      I always have 8 afs daemons running:

    ps aux|grep afsd|cat -n
     1  root       9127  0.0  0.0 2.05M 176K pts/4    R  + 10:35:33
0:00.01 grep afsd
     2  root        866  0.0  0.0  976K  32K ??       U    07:53:42
0:00.00 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
     3  root        861  0.0  0.0  976K  32K ??       U    07:52:33
0:00.23 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
     4  root        859  0.0  0.0  976K  32K ??       U    07:52:33
0:00.21 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
     5  root        858  0.0  0.0  976K  32K ??       U    07:52:33
0:00.01 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
     6  root        857  0.0  0.0  976K  32K ??       U    07:52:33
0:00.41 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
     7  root        856  0.0  0.0  976K  40K ??       U    07:52:33
0:00.01 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
     8  root        860  0.0  0.0  976K  32K ??       U    07:52:33
0:00.22 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70
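
    To pick out the processes in state U from ps output like the listing
    above, something along these lines could be used. It is only a sketch:
    it assumes the state is the 8th whitespace-separated field, as in this
    "ps aux" layout, and that the wrapped lines have been joined and the
    "cat -n" numbers removed. The function name is illustrative.

```python
def uninterruptible_pids(ps_lines, state_col=7):
    """Return the PIDs of processes whose state column is exactly 'U'."""
    pids = []
    for line in ps_lines:
        fields = line.split()
        # Skip short or header lines; field 1 is the PID in this layout.
        if len(fields) > state_col and fields[state_col] == "U":
            pids.append(int(fields[1]))
    return pids


# Two entries from the listing above, with the mail line wrapping undone.
sample = [
    "root 9127 0.0 0.0 2.05M 176K pts/4 R + 10:35:33 0:00.01 grep afsd",
    "root 866 0.0 0.0 976K 32K ?? U 07:53:42 0:00.00 /usr/vice/etc/afsd -stat 2000 -dcache 800 -daemons 3 -volumes 70",
]
print(uninterruptible_pids(sample))
```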

   with best regards,
    Kumar