[OpenAFS] ARM64 5.4.0-42 with 1.8.5 oops in afs_CellNumValid
Benjamin Kaduk
kaduk@mit.edu
Sat, 29 Aug 2020 18:09:01 -0700
On Wed, Aug 26, 2020 at 05:48:45AM +1000, Ian Wienand wrote:
> Hello,
>
> These messages and oops popped up on one of our ARM64 hosts, running
> 1.8.5 client on Ubuntu Focal
>
> # uname -a
> Linux mirror02 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
>
> ---
> [Tue Aug 25 09:43:14 2020] openafs: loading out-of-tree module taints kernel.
> [Tue Aug 25 09:43:14 2020] openafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
> [Tue Aug 25 09:43:14 2020] Disabling lock debugging due to kernel taint
> [Tue Aug 25 09:43:14 2020] openafs: module verification failed: signature and/or required key missing - tainting kernel
> [Tue Aug 25 09:43:14 2020] Key type afs_pag registered
> [Tue Aug 25 09:43:16 2020] enabling dynamically allocated vcaches
> [Tue Aug 25 09:43:16 2020] Starting AFS cache scan...
> [Tue Aug 25 09:44:46 2020] Key type afs_pag unregistered
> [Tue Aug 25 09:44:46 2020] nonzero refcount in shutdown_osisleep()
> [Tue Aug 25 09:44:46 2020] nonzero refcount in shutdown_osisleep()
> [Tue Aug 25 09:44:46 2020] nonzero refcount in shutdown_osisleep()
> [Tue Aug 25 09:44:46 2020] nonzero refcount in shutdown_osisleep()
> [Tue Aug 25 09:44:46 2020] Unable to handle kernel paging request at virtual address 007999dede997087
> [Tue Aug 25 09:44:46 2020] Mem abort info:
> [Tue Aug 25 09:44:46 2020] ESR = 0x96000004
> [Tue Aug 25 09:44:46 2020] EC = 0x25: DABT (current EL), IL = 32 bits
> [Tue Aug 25 09:44:46 2020] SET = 0, FnV = 0
> [Tue Aug 25 09:44:46 2020] EA = 0, S1PTW = 0
> [Tue Aug 25 09:44:46 2020] Data abort info:
> [Tue Aug 25 09:44:46 2020] ISV = 0, ISS = 0x00000004
> [Tue Aug 25 09:44:46 2020] CM = 0, WnR = 0
> [Tue Aug 25 09:44:46 2020] [007999dede997087] address between user and kernel address ranges
> [Tue Aug 25 09:44:46 2020] Internal error: Oops: 96000004 [#1] SMP
> [Tue Aug 25 09:44:46 2020] Modules linked in: openafs(POE-) ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds sch_fq_codel drm ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_net net_failover virtio_blk failover aes_neon_bs aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher
> [Tue Aug 25 09:44:46 2020] CPU: 2 PID: 70020 Comm: afsd Tainted: P OE 5.4.0-42-generic #46-Ubuntu
> [Tue Aug 25 09:44:46 2020] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [Tue Aug 25 09:44:46 2020] pstate: 80400005 (Nzcv daif +PAN -UAO)
> [Tue Aug 25 09:44:46 2020] pc : afs_CellNumValid+0x54/0xd8 [openafs]
> [Tue Aug 25 09:44:46 2020] lr : afs_UFSGetDSlot+0x1ac/0x548 [openafs]
> [Tue Aug 25 09:44:46 2020] sp : ffff80001382b9f0
> [Tue Aug 25 09:44:46 2020] x29: ffff80001382b9f0 x28: 00000000fffffffb
> [Tue Aug 25 09:44:46 2020] x27: 0000000000000001 x26: 000000000005f14f
> [Tue Aug 25 09:44:46 2020] x25: 0000000001db68c4 x24: 0000000000000000
> [Tue Aug 25 09:44:46 2020] x23: ffff0001df052d00 x22: ffff80000932b000
> [Tue Aug 25 09:44:46 2020] x21: ffff80000932b518 x20: 0000000000000000
> [Tue Aug 25 09:44:46 2020] x19: 7f7999dede99707f x18: 0000000000000000
> [Tue Aug 25 09:44:46 2020] x17: 0000000000000000 x16: 0000000000000000
> [Tue Aug 25 09:44:46 2020] x15: 0000000000000060 x14: ffff0001df052d00
> [Tue Aug 25 09:44:46 2020] x13: ffff0001df052d00 x12: 0000000000000000
> [Tue Aug 25 09:44:46 2020] x11: ffff800011d3d000 x10: 0000aaaad6538508
> [Tue Aug 25 09:44:46 2020] x9 : fefefefefefefeff x8 : ffff0001df052d00
> [Tue Aug 25 09:44:46 2020] x7 : 0000aaaad6538508 x6 : 0000000000000000
> [Tue Aug 25 09:44:46 2020] x5 : fffffe00047186b4 x4 : 0000000000000018
> [Tue Aug 25 09:44:46 2020] x3 : 0000000000306910 x2 : ffff0001df052d00
> [Tue Aug 25 09:44:46 2020] x1 : 000000000000004e x0 : 0000000000000000
> [Tue Aug 25 09:44:46 2020] Call trace:
> [Tue Aug 25 09:44:46 2020] afs_CellNumValid+0x54/0xd8 [openafs]
It could be related to https://gerrit.openafs.org/14093, especially if
something was reading /proc/fs/openafs/CellServDB at the same time.
Note that this change was pulled up to 1.8.x as 14284 and should be
included in 1.8.7pre1, which is pending.
-Ben
> [Tue Aug 25 09:44:46 2020] afs_UFSGetDSlot+0x1ac/0x548 [openafs]
> [Tue Aug 25 09:44:46 2020] afs_InitCacheFile+0xac/0x620 [openafs]
> [Tue Aug 25 09:44:46 2020] afs_syscall_call+0xc24/0x18d0 [openafs]
> [Tue Aug 25 09:44:46 2020] afs_syscall+0xe4/0x588 [openafs]
> [Tue Aug 25 09:44:46 2020] afs_unlocked_ioctl+0x78/0xe0 [openafs]
> [Tue Aug 25 09:44:46 2020] proc_reg_unlocked_ioctl+0x80/0xc8
> [Tue Aug 25 09:44:46 2020] do_vfs_ioctl+0xa54/0xe58
> [Tue Aug 25 09:44:46 2020] ksys_ioctl+0x84/0xb8
> [Tue Aug 25 09:44:46 2020] __arm64_sys_ioctl+0x28/0x58
> [Tue Aug 25 09:44:46 2020] el0_svc_common.constprop.0+0xdc/0x1d8
> [Tue Aug 25 09:44:46 2020] el0_svc_handler+0x34/0xa0
> [Tue Aug 25 09:44:46 2020] el0_svc+0x10/0x14
> [Tue Aug 25 09:44:46 2020] Code: b5000093 14000006 f9400273 b4000093 (b9400a61)
> [Tue Aug 25 09:44:46 2020] ---[ end trace 76e155592eabb925 ]---
> ---
>
> I cannot find any errors related to this client in the server logs
> (none at all, and in particular none around this time).
>
> This server is not under any load at this point; it seems this
> happened just after the module was installed when it would have been
> going through its orchestration process. This would mean not much
> was reading from AFS; perhaps just apache getting ready to serve it.
>
> I'm not sure if I can replicate it, at this point.
>
> -i
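For anyone reading the oops above, here is a minimal sketch of how the
"Mem abort info" fields relate to the raw ESR value, and how the reported
fault address relates to x19. The helper names (decode_esr, untag) are
hypothetical and exist only for this illustration. The offset of 8 is an
assumption taken from reading the faulting instruction word (b9400a61) as
`ldr w1, [x19, #8]`, and the top-byte masking is an assumption about how
the kernel reports the address. Treat this as a reading aid, not a
diagnosis.

```python
def decode_esr(esr: int) -> dict:
    """Split a 32-bit ARM64 ESR_ELx value into the fields the oops prints."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # instruction length bit (1 = 32-bit)
        "iss": esr & 0x1FFFFFF,     # instruction-specific syndrome, bits [24:0]
        "wnr": (esr >> 6) & 0x1,    # for data aborts: 1 = write, 0 = read
        "dfsc": esr & 0x3F,         # data fault status code, bits [5:0]
    }


def untag(addr: int) -> int:
    """Clear the top byte of a 64-bit address (assumed reporting behaviour)."""
    return addr & ((1 << 56) - 1)


fields = decode_esr(0x96000004)      # ESR value from the log
x19 = 0x7F7999DEDE99707F             # x19 from the register dump
fault = untag(x19 + 8)               # offset 8 is an assumption (see above)

print({k: hex(v) for k, v in fields.items()})
print(hex(fault))
```

With the values from the log, ec comes out as 0x25 (data abort at the
current EL), wnr as 0 (a read), and the computed address as
0x7999dede997087, which matches the address in the "Unable to handle
kernel paging request" line. That is consistent with a load through a
garbage pointer held in x19, though it does not by itself say where the
bad pointer came from.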