[OpenAFS-devel] oops on AFS mount
John A. Goebel
jgoebel@SLAC.Stanford.EDU
Sat, 11 May 2002 11:35:45 -0700
--Boundary_(ID_zXq3eNO7YhTejwbpiX6J/w)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7BIT
Content-disposition: inline
Hello,
We have a problem that _seems_ to be AFS related, although the oops
points to an EXT3 function. The problem happens both on openafs 1.2.3
and release canidate 2, 1.2.4, but not a local filesystem directory,
nor an NFS mount point. The crash also happens if the machine that the AFS
space resides is running Transarc AFS or OpenAFS under Linux.
The test method is to run bonnie++ on a directory in AFS space.
After doing a write that is large (>= 1 gig) in AFS space, when bonnie
calls an unlink to the created file, there is an oops, and the
system wedges. While doing a strace on the process I get this:
temp11 kernel: Assertion failure in journal_write_metadata_buffer() at
journal.c
:406: "buffer_jdirty(jh2bh(jh_in))"
The cache space is at most 60% used, the disk has plenty of space left,
and the network to the server side is fine. There is a simple log file
attached to this mail.
There was a similar problem the reported on the LKML, at least at the
functional level <http://linux.msede.com/lvm_mlist/archive/2000/11/0145.html>.
Attached is the oops, a messy ksymoops of the oops, an strace
on the running process, and a snapshot of the filesystem state just
before the crash.
Thanks for any help you can give.
John
##############################################
# John Goebel <jgoebel(at)slac.stanford.edu> #
# Stanford Linear Accelerator Center #
# 2575 Sand Hill Road, Menlo Park, CA 94025 #
############################################ #
--Boundary_(ID_zXq3eNO7YhTejwbpiX6J/w)
Content-type: text/plain; charset=us-ascii; NAME=report
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=report
Message from syslogd@temp11 at Fri May 10 14:39:35 2002 ...
temp11 kernel: Assertion failure in journal_write_metadata_buffer() at journal.c:406: "buffer_jdirty(jh2bh(jh_in))"
temp11 login: Assertion failure in journal_write_metadata_buffer() at journal.c:406: "buffer_jdirty(jh2bh(jh_in))"
invalid operand: 0000
nfs lockd sunrpc libafs-2.4.18-4smp.mp e100 ext3 jbd sym53c8xx sd_mod scsi_mod
CPU: 0
EIP: 0010:[<f884f9b4>] Tainted: PF
EFLAGS: 00010282
EIP is at journal_write_metadata_buffer [jbd] 0x74 (2.4.18-4smp)
eax: 0000001d ebx: 00000000 ecx: c02eeec0 edx: 00003552
esi: 00000000 edi: f6a507e0 ebp: d3583400 esp: f6c75e44
ds: 0018 es: 0018 ss: 0018
Process kjournald (pid: 145, stackpage=f6c75000)
Stack: f8854181 00000196 000015f4 f6b06c00 00000000 00000000 f7fe3520 00000000
f6a507e0 d3583400 f884cdf4 f6a507e0 f7fe3520 f6c75e98 000017f0 50195cfc
00000000 00000fd4 e4d3f02c 00000003 f6a507e0 db272cd0 000017f0 c010de61
Call Trace: [<f8854181>] .rodata.str1.1 [jbd] 0x4e1
[<f884cdf4>] journal_commit_transaction [jbd] 0x7e4
[<c010de61>] timer_interrupt [kernel] 0xb1
[<f884f806>] kjournald [jbd] 0x136
[<f884f6b0>] commit_timeout [jbd] 0x0
[<c0107286>] kernel_thread [kernel] 0x26
[<f884f6d0>] kjournald [jbd] 0x0
Code: 0f 0b 5e 5f 8b 7c 24 28 8b 4f 0c 85 c9 74 2e c7 44 24 0c 01
--Boundary_(ID_zXq3eNO7YhTejwbpiX6J/w)
Content-type: text/plain; charset=us-ascii; NAME=report.oops
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=report.oops
ksymoops 2.4.1 on i686 2.4.18-4smp. Options used
-V (default)
-k /proc/ksyms (specified)
-l /proc/modules (default)
-o /lib/modules/2.4.18-4smp/ (default)
-m /boot/System.map-2.4.18-4smp (default)
Error (expand_objects): cannot stat(/lib/ext3.o) for ext3
Error (expand_objects): cannot stat(/lib/jbd.o) for jbd
Error (expand_objects): cannot stat(/lib/sym53c8xx.o) for sym53c8xx
Error (expand_objects): cannot stat(/lib/sd_mod.o) for sd_mod
Error (expand_objects): cannot stat(/lib/scsi_mod.o) for scsi_mod
Warning (compare_maps): mismatch on symbol partition_name , ksyms_base says c01cdad0, System.map says c0168570. Ignoring ksyms_base entry
Warning (compare_maps): mismatch on symbol nlmsvc_ops , lockd says f8d69cd0, /lib/modules/2.4.18-4smp/kernel/fs/lockd/lockd.o says f8d69134. Ignoring /lib/modules/2.4.18-4smp/kernel/fs/lockd/lockd.o entry
Warning (compare_maps): mismatch on symbol nfs_debug , sunrpc says f8d5af04, /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o says f8d5abe4. Ignoring /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol nfsd_debug , sunrpc says f8d5af08, /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o says f8d5abe8. Ignoring /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol nlm_debug , sunrpc says f8d5af0c, /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o says f8d5abec. Ignoring /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o entry
Warning (compare_maps): mismatch on symbol rpc_debug , sunrpc says f8d5af00, /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o says f8d5abe0. Ignoring /lib/modules/2.4.18-4smp/kernel/net/sunrpc/sunrpc.o entry
Warning (map_ksym_to_module): cannot match loaded module ext3 to a unique module object. Trace may not be reliable.
Warning (compare_maps): mismatch on symbol sd , sd_mod says f882d240, /lib/modules/2.4.18-4smp/kernel/drivers/scsi/sd_mod.o says f882d1a0. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/sd_mod.o entry
Warning (compare_maps): scsi_mod symbol GPLONLY_scsi_reset_provider not found in /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o entry
Warning (compare_maps): mismatch on symbol proc_scsi , scsi_mod says f8828654, /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o says f8826d88. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o entry
Warning (compare_maps): mismatch on symbol scsi_devicelist , scsi_mod says f8828680, /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o says f8826db4. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o entry
Warning (compare_maps): mismatch on symbol scsi_hostlist , scsi_mod says f882867c, /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o says f8826db0. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o entry
Warning (compare_maps): mismatch on symbol scsi_hosts , scsi_mod says f8828684, /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o says f8826db8. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o entry
Warning (compare_maps): mismatch on symbol scsi_logging_level , scsi_mod says f8828650, /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o says f8826d84. Ignoring /lib/modules/2.4.18-4smp/kernel/drivers/scsi/scsi_mod.o entry
invalid operand: 0000
CPU: 0
EIP: 0010:[<f884f9b4>] Tainted: PF
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010282
eax: 0000001d ebx: 00000000 ecx: c02eeec0 edx: 00003552
esi: 00000000 edi: f6a507e0 ebp: d3583400 esp: f6c75e44
ds: 0018 es: 0018 ss: 0018
Process kjournald (pid: 145, stackpage=f6c75000)
Stack: f8854181 00000196 000015f4 f6b06c00 00000000 00000000 f7fe3520 00000000
f6a507e0 d3583400 f884cdf4 f6a507e0 f7fe3520 f6c75e98 000017f0 50195cfc
00000000 00000fd4 e4d3f02c 00000003 f6a507e0 db272cd0 000017f0 c010de61
Call Trace: [<f8854181>] .rodata.str1.1 [jbd] 0x4e1
[<f884cdf4>] journal_commit_transaction [jbd] 0x7e4
[<c010de61>] timer_interrupt [kernel] 0xb1
[<f884f806>] kjournald [jbd] 0x136
[<f884f6b0>] commit_timeout [jbd] 0x0
[<c0107286>] kernel_thread [kernel] 0x26
[<f884f6d0>] kjournald [jbd] 0x0
Code: 0f 0b 5e 5f 8b 7c 24 28 8b 4f 0c 85 c9 74 2e c7 44 24 0c 01
>>EIP; f884f9b4 <[jbd]journal_write_metadata_buffer+74/350> <=====
Trace; f8854181 <[jbd].rodata.end+29c2/59b9>
Trace; f884cdf4 <[jbd]journal_commit_transaction+7e4/11e3>
Trace; c010de61 <timer_interrupt+b1/140>
Trace; f884f806 <[jbd]kjournald+136/1d0>
Trace; f884f6b0 <[jbd]commit_timeout+0/10>
Trace; c0107286 <kernel_thread+26/30>
Trace; f884f6d0 <[jbd]kjournald+0/1d0>
Code; f884f9b4 <[jbd]journal_write_metadata_buffer+74/350>
00000000 <_EIP>:
Code; f884f9b4 <[jbd]journal_write_metadata_buffer+74/350> <=====
0: 0f 0b ud2a <=====
Code; f884f9b6 <[jbd]journal_write_metadata_buffer+76/350>
2: 5e pop %esi
Code; f884f9b7 <[jbd]journal_write_metadata_buffer+77/350>
3: 5f pop %edi
Code; f884f9b8 <[jbd]journal_write_metadata_buffer+78/350>
4: 8b 7c 24 28 mov 0x28(%esp,1),%edi
Code; f884f9bc <[jbd]journal_write_metadata_buffer+7c/350>
8: 8b 4f 0c mov 0xc(%edi),%ecx
Code; f884f9bf <[jbd]journal_write_metadata_buffer+7f/350>
b: 85 c9 test %ecx,%ecx
Code; f884f9c1 <[jbd]journal_write_metadata_buffer+81/350>
d: 74 2e je 3d <_EIP+0x3d> f884f9f1 <[jbd]journal_write_metadata_buffer+b1/350>
Code; f884f9c3 <[jbd]journal_write_metadata_buffer+83/350>
f: c7 44 24 0c 01 00 00 movl $0x1,0xc(%esp,1)
Code; f884f9ca <[jbd]journal_write_metadata_buffer+8a/350>
16: 00
14 warnings and 5 errors issued. Results may not be reliable.
--Boundary_(ID_zXq3eNO7YhTejwbpiX6J/w)
Content-type: text/plain; charset=us-ascii; NAME=disk_state.log
Content-transfer-encoding: 7BIT
Content-disposition: attachment; filename=disk_state.log
Fri May 10 14:39:23 PDT 2002
################################################################################Volume Name Quota Used %Used Partition
ext2 5000000 1796177 36% 3%
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda1 2063504 1803376 155308 93% /
none 515524 0 515524 0% /dev/shm
/dev/hda6 1035660 605072 377980 62% /scswork
/dev/hda8 2601116 644916 1824068 27% /tmp
/dev/hda7 23333 1043 21086 5% /mnt/t
/dev/hda5 1035660 753456 229596 77% /usr/vice/cache
/dev/hda3 1035692 137108 845972 14% /var
AFS 9000000 0 9000000 0% /afs
total 4
-rwxr-xr-x 1 root root 398 May 8 12:20 monitor
Available sets:
cmlongterm inactive
cm inactive (dormant)
cmfx : 60 kbytes (allocated)
--Boundary_(ID_zXq3eNO7YhTejwbpiX6J/w)--