[OpenAFS] 2.4.x afsd shutdown oops when processes still accessing /afs

Rudolph T Maceyko rtm@cert.org
Tue, 09 Oct 2001 15:39:42 -0400


--==========809195387==========
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

When processes are still accessing afs (e.g., cwd is /afs/...) at the 
time a shutdown is occurring things get messy.

The afs shutdown fails to umount /afs (EBUSY) and fails to unload the 
module (EBUSY).  Then several more umount attempts fail (still EBUSY), 
the process gets killed and an oops results for any subsequent umount 
/afs.  (See attached.)

I've noticed this under Red Hat 7.1 with Red Hat's "stock" 2.4.[367] 
kernels and OpenAFS 1.2.1.  I have not tried other distributions or 
kernels.  I don't recall the "oops" behavior happening under these 
circumstances with OpenAFS prior to 1.2.1.

Ordinarily the problem happens at shutdown time sufficiently late that 
klogd isn't still running so there's nothing to run ksymoops on, so 
I've reproduced the problem artificially.

(Make sure a process has cwd of /afs/...)

# /etc/init.d/afs stop
Stopping AFS services.....
inetd.afs: no process killed
umount: /afs: device is busy
libafs-2.4.7-2-i686: Device or resource busy

(kill the process whose cwd was /afs/...)

# umount /afs

(watch oops as it occurs)

When this happens at shutdown a power-cycle is required.

I wish I had the expertise to track this down further myself, but alas 
I can't do that without guidance.  The problem is about 90% 
reproducible as shown above, and decreasing as I continue to exersize 
the problem...  I don't know why sometimes I don't get an oops.

When doing it by hand, make sure to umount /afs at the end and rmmod 
libafs or the box will hang unmounting filesystems when you shutdown.

Rudy

--==========809195387==========
Content-Type: text/plain; charset=iso-8859-1; name="ksymoops.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="ksymoops.txt"; size=6681

ksymoops 2.4.0 on i686 2.4.7-2.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.7-2/ (default)
     -m /boot/System.map-2.4.7-2 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Warning (compare_maps): ksyms_base symbol =
__VERSIONED_SYMBOL(ataraid_gendisk) not found in System.map.  Ignoring =
ksyms_base entry
Warning (compare_maps): ksyms_base symbol =
__VERSIONED_SYMBOL(ataraid_get_device) not found in System.map.  =
Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol =
__VERSIONED_SYMBOL(ataraid_register_disk) not found in System.map.  =
Ignoring ksyms_base entry
Warning (compare_maps): ksyms_base symbol =
__VERSIONED_SYMBOL(ataraid_release_device) not found in System.map.  =
Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base =
says c01b7600, System.map says c0157550.  Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol usb_devfs_handle  , usbcore =
says e106f5a0, /lib/modules/2.4.7-2/kernel/drivers/usb/usbcore.o says =
e106f0c0.  Ignoring /lib/modules/2.4.7-2/kernel/drivers/usb/usbcore.o =
entry
Oct  8 10:22:38 gti kernel: 3c59x: Donald Becker and others. =
www.scyld.com/network/vortex.html
Oct  8 10:41:51 gti kernel: 3c59x: Donald Becker and others. =
www.scyld.com/network/vortex.html
Oct  8 15:19:40 gti kernel: ac97_codec: AC97 Audio codec, id: =
0x8384:0x7609 (SigmaTel STAC9721/23)
Oct  8 15:46:19 gti kernel: ac97_codec: AC97 Audio codec, id: =
0x8384:0x7609 (SigmaTel STAC9721/23)
Oct  8 17:55:05 gti kernel: 3c59x: Donald Becker and others. =
www.scyld.com/network/vortex.html
Oct  8 17:58:18 gti kernel: 3c59x: Donald Becker and others. =
www.scyld.com/network/vortex.html
Oct  8 18:59:03 gti kernel: 3c59x: Donald Becker and others. =
www.scyld.com/network/vortex.html
Oct  8 21:31:26 gti kernel: ac97_codec: AC97 Audio codec, id: =
0x8384:0x7609 (SigmaTel STAC9721/23)
Oct  8 23:48:39 gti kernel: ac97_codec: AC97 Audio codec, id: =
0x8384:0x7609 (SigmaTel STAC9721/23)
Oct  9 10:15:22 gti kernel: 3c59x: Donald Becker and others. =
www.scyld.com/network/vortex.html
Oct  9 10:41:06 gti kernel: kernel BUG at slab.c:1570!
Oct  9 10:41:06 gti kernel: invalid operand: 0000
Oct  9 10:41:06 gti kernel: CPU:    0
Oct  9 10:41:06 gti kernel: EIP:    0010:[kmem_cache_free+87/704]
Oct  9 10:41:06 gti kernel: EIP:    0010:[<c012dca7>]
Using defaults from ksymoops -t elf32-i386 -a i386
Oct  9 10:41:06 gti kernel: EFLAGS: 00010282
Oct  9 10:41:06 gti kernel: eax: 0000001b   ebx: e11c1fe0   ecx: =
00000001   edx: 00001f86
Oct  9 10:41:06 gti kernel: esi: 008cb744   edi: e11c1fe0   ebp: =
00000001   esp: dc79feb4
Oct  9 10:41:06 gti kernel: ds: 0018   es: 0018   ss: 0018
Oct  9 10:41:06 gti kernel: Process umount (pid: 1296, =
stackpage=3Ddc79f000)
Oct  9 10:41:06 gti kernel: Stack: c021b79c 00000622 c1b10134 c01264b5 =
e11c1fe0 e11c1fe0 e11c1fe0 00000001=20
Oct  9 10:41:06 gti kernel:        c014a2fa c1b8832c e11c1fe0 e11c1fe0 =
c014b96b e11c1fe0 dd4d955c 00000246=20
Oct  9 10:41:06 gti kernel:        df5c4394 df5bde04 df1cd3ac df1cd394 =
c01495c3 e11c1fe0 e11c1fe0 c0141690=20
Oct  9 10:41:06 gti kernel: Call Trace: [copyrite+16636/36756] =
[handle_mm_fault+101/224] [<e11c1fe0>] [<e11c1fe0>] [<e11c1fe0>]=20
Oct  9 10:41:06 gti kernel: Call Trace: [<c021b79c>] [<c01264b5>] =
[<e11c1fe0>] [<e11c1fe0>] [<e11c1fe0>]=20
Oct  9 10:41:06 gti kernel:    [<c014a2fa>] [<e11c1fe0>] [<e11c1fe0>] =
[<c014b96b>] [<e11c1fe0>] [<c01495c3>]=20
Oct  9 10:41:06 gti kernel:    [<e11c1fe0>] [<e11c1fe0>] =
[cached_lookup+16/80] [<e113fd60>] [<e113fd2c>] =
[shrink_dcache_parent+22/48]=20
Oct  9 10:41:06 gti kernel:    [<e11c1fe0>] [<e11c1fe0>] [<c0141690>] =
[<e113fd60>] [<e113fd2c>] [<c01498c6>]=20
Oct  9 10:41:06 gti kernel:    [<c013caea>] [<c0141678>] [<c013cf7a>] =
[<c0136f23>] [<c01275d2>] [<c013cfbc>]=20
Oct  9 10:41:06 gti kernel:    [<c0106f0b>]=20
Oct  9 10:41:06 gti kernel: Code: 0f 0b 58 5a 8b 15 4c 14 2d c0 eb 0d =
90 90 90 90 90 90 90 90=20

>>EIP; c012dca7 <kmem_cache_free+57/2c0>   <=3D=3D=3D=3D=3D
Trace; c021b79c <copyrite+40fc/8f94>
Trace; c01264b5 <handle_mm_fault+65/e0>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c014a2fa <destroy_inode+2a/30>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c014b96b <iput+19b/1b0>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c01495c3 <prune_dcache+d3/160>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; e11c1fe0 <END_OF_CODE+73321/????>
Trace; c0141690 <cached_lookup+10/50>
Trace; e113fd60 <[libafs-2.4.7-2-i686]rcsid+40/120>
Trace; e113fd2c <[libafs-2.4.7-2-i686]rcsid+c/120>
Trace; c01498c6 <shrink_dcache_parent+16/30>
Trace; c013caea <kill_super+4a/100>
Trace; c0141678 <path_release+28/30>
Trace; c013cf7a <sys_umount+da/110>
Trace; c0136f23 <filp_close+53/60>
Trace; c01275d2 <sys_munmap+32/50>
Trace; c013cfbc <sys_oldumount+c/10>
Trace; c0106f0b <system_call+33/38>
Code;  c012dca7 <kmem_cache_free+57/2c0>
00000000 <_EIP>:
Code;  c012dca7 <kmem_cache_free+57/2c0>   <=3D=3D=3D=3D=3D
   0:   0f 0b                     ud2a      <=3D=3D=3D=3D=3D
Code;  c012dca9 <kmem_cache_free+59/2c0>
   2:   58                        pop    %eax
Code;  c012dcaa <kmem_cache_free+5a/2c0>
   3:   5a                        pop    %edx
Code;  c012dcab <kmem_cache_free+5b/2c0>
   4:   8b 15 4c 14 2d c0         mov    0xc02d144c,%edx
Code;  c012dcb1 <kmem_cache_free+61/2c0>
   a:   eb 0d                     jmp    19 <_EIP+0x19> c012dcc0 =
<kmem_cache_free+70/2c0>
Code;  c012dcb3 <kmem_cache_free+63/2c0>
   c:   90                        nop   =20
Code;  c012dcb4 <kmem_cache_free+64/2c0>
   d:   90                        nop   =20
Code;  c012dcb5 <kmem_cache_free+65/2c0>
   e:   90                        nop   =20
Code;  c012dcb6 <kmem_cache_free+66/2c0>
   f:   90                        nop   =20
Code;  c012dcb7 <kmem_cache_free+67/2c0>
  10:   90                        nop   =20
Code;  c012dcb8 <kmem_cache_free+68/2c0>
  11:   90                        nop   =20
Code;  c012dcb9 <kmem_cache_free+69/2c0>
  12:   90                        nop   =20
Code;  c012dcba <kmem_cache_free+6a/2c0>
  13:   90                        nop   =20


7 warnings issued.  Results may not be reliable.

--==========809195387==========--