[OpenAFS] afsd causes crash for openafs1.2 and kernel 2.4.7 (fwd)

Warren.Yenson@morganstanley.com Warren.Yenson@morganstanley.com
Fri, 21 Sep 2001 18:23:42 -0400 (EDT)


We have seen that we can repeatedly crash a Linux box running 2.4.7 and
OpenAFS 1.2 by doing operations in /afs that open a large number of files
or directories (e.g. du -sk).
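
A minimal sketch of this kind of workload, assuming a Python interpreter is available on the client. It walks a tree and calls lstat on every entry, which is roughly what du does (the trace below shows the du process inside sys_lstat64). The helper name stat_tree is hypothetical, and the directory to walk is passed on the command line:

```python
import os
import sys


def stat_tree(root):
    """Walk root and lstat every directory and file; return how many were stat'ed."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # Each lstat forces the client to look up and cache one more entry.
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count


if __name__ == "__main__":
    # Directory to walk; defaults to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print(stat_tree(root))
```

Pointing it at a large subtree should touch about as many entries as du -sk on the same path.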

The message on the screen is:

  Sep 20 12:58:07 saloon1 kernel: Increase -stat parameter of afsd(VLRU cycle?)<1> Unable to handle kernel paging request at virtual address
ffffffff

We started with the default -stat value of 2800 for a large model, and have
tried various values up to 10000, but still see the fault.

Since we don't see this on our regular (Transarc) AFS on Solaris, I'm
wondering if anyone knows of some kind of leak in this version of OpenAFS.

Here's what afsd is starting with:

root       434     1  0 13:04 ?        00:00:00 /etc/vice/etc/afsd
-volumes 70 -mountdir /ms -daemons 3 -nosettime -cachedir /afscache
-rootvol root.afs -logfile /var/tmp/AFSLog -dcache 800 -stat 3000

And here are the log messages when the machine hangs:

Sep 20 12:58:07 saloon1 kernel: Increase -stat parameter of afsd(VLRU
cycle?)<1> Unable to handle kernel paging request at virtual address
ffffffff
Sep 20 12:58:07 saloon1 kernel:  printing eip:
Sep 20 12:58:07 saloon1 kernel: e8982278
Sep 20 12:58:07 saloon1 kernel: *pde = 0804cc09
Sep 20 12:58:07 saloon1 kernel: *pte = 00000000
Sep 20 12:58:07 saloon1 kernel: Oops: 0002
Sep 20 12:58:07 saloon1 kernel: CPU:    1
Sep 20 12:58:07 saloon1 kernel: EIP:    0010:[openafs:osi_Panic+40/48]
Sep 20 12:58:07 saloon1 kernel: EIP:    0010:[<e8982278>]
Sep 20 12:58:07 saloon1 kernel: EFLAGS: 00010286
Sep 20 12:58:07 saloon1 kernel: eax: 0000002d   ebx: d041d600   ecx:
00000086   edx: d7209f64
Sep 20 12:58:07 saloon1 kernel: esi: e89b1204   edi: 00000000   ebp:
d041d724   esp: d4891d0c
Sep 20 12:58:07 saloon1 kernel: ds: 0018   es: 0018   ss: 0018
Sep 20 12:58:07 saloon1 kernel: Process du (pid: 1847, stackpage=d4891000)
Sep 20 12:58:07 saloon1 kernel: Stack: e895b601 e89a09a0 00000000 00000010
00000246 00000001 00001c2d 00000000
Sep 20 12:58:07 saloon1 kernel:        00000000 00001770 00000022 e8aedcfc
e8aedcfc 00000005 00001772 cffc0400
Sep 20 12:58:07 saloon1 kernel:        d4891e50 00000000 00000000 00000000
e895d182 d4891e70 00000000 00000001
Sep 20 12:58:07 saloon1 kernel: Call Trace: [openafs:afs_NewVCache+837/2016]
[openafs:__insmod_openafs_S.rodata_L2008+8832/36000]
[e100:adapters_proc_dir+1762332/218288124] [e100:adapters_proc_dir+1762332/218288124]
[openafs:afs_GetVCache+290/1256]
Sep 20 12:58:07 saloon1 kernel: Call Trace: [<e895b601>] [<e89a09a0>]
[<e8aedcfc>] [<e8aedcfc>] [<e895d182>]
Sep 20 12:58:07 saloon1 kernel:    [openafs:afs_dir_GetBlob+24/52]
[openafs:DirHash+165/312] [openafs:afs_dir_LookupOffset+111/124]
[openafs:afs_lookup+2242/3456] [openafs:afs_AccessOK+58/336] [openafs:afs_access+390/684]
Sep 20 12:58:07 saloon1 kernel:    [<e8951a94>] [<e8951b55>] [<e8951883>]
[<e8966a7a>] [<e895fb52>] [<e895fdee>]
Sep 20 12:58:07 saloon1 kernel:    [openafs:afs_linux_lookup+104/392]
[d_alloc+25/384] [real_lookup+115/272] [path_walk+1646/2288] [__user_walk+58/96]
[sys_lstat64+19/112]
Sep 20 12:58:07 saloon1 kernel:    [<e898ec54>] [<c0151f29>] [<c0148b33>]
[<c014937e>] [<c0149bea>] [<c0146343>]
Sep 20 12:58:07 saloon1 kernel:    [system_call+51/56]
Sep 20 12:58:07 saloon1 kernel:    [<c010716b>]
Sep 20 12:58:07 saloon1 kernel:
Sep 20 12:58:07 saloon1 kernel: Code: c6 05 ff ff ff ff 2a c3 55 57 56 53
8b 74 24 18 83 fe 01 8b
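
As an aid to reading the Code: line, here is a small sketch that decodes only its first instruction by hand. It assumes the standard IA-32 encoding in which opcode c6 followed by ModRM byte 05 is a one-byte store of an immediate to an absolute 32-bit address, and that the Code: bytes start at the faulting EIP. The helper name decode_mov_imm8_abs is hypothetical and is not part of any existing tool:

```python
import struct

# Bytes copied from the "Code:" line of the oops above.
CODE = "c6 05 ff ff ff ff 2a c3 55 57 56 53 8b 74 24 18 83 fe 01 8b"


def decode_mov_imm8_abs(code_bytes):
    """Decode 'c6 05 <disp32> <imm8>' (movb $imm8, disp32).

    Returns (address, value), or None if the bytes do not start with that form.
    """
    if len(code_bytes) < 7 or code_bytes[0] != 0xC6 or code_bytes[1] != 0x05:
        return None
    # disp32 is stored little-endian in the four bytes after the ModRM byte.
    (address,) = struct.unpack_from("<I", code_bytes, 2)
    return address, code_bytes[6]


raw = bytes.fromhex(CODE.replace(" ", ""))
address, value = decode_mov_imm8_abs(raw)
print("store of 0x%02x to address 0x%08x" % (value, address))
```

Under those assumptions the first instruction is a write to address ffffffff, which matches the faulting address and the write fault reported in the oops (Oops: 0002). That would be consistent with osi_Panic faulting on purpose after the "Increase -stat parameter" message rather than hitting a stray pointer, although this has not been checked against the OpenAFS source.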