[OpenAFS] AFS crashed at access to "~/.IceAuthority"
Frédéric Grelot
fredericg_99@yahoo.fr
Tue, 17 Nov 2009 23:16:29 +0100 (CET)
Hi everybody,
I'm using openafs 1.4.11 (default version on ubuntu karmic), with a "standard" kerberos configuration, together with libnss-afs and nscd to allow user home directories to be seamlessly translated.
I just experienced several complete afs crashes, while I was trying to run evince (the pdf reader). It seems, according to dmesg, that it happens when evince tries to access the special file "~/.ICEauthority", which must be (I'm not sure) a socket file of some sort.
The session then hangs completely, and I couldn't find any other way than logging as root and "# reboot" to get my system back.
I managed to get the logs with dmesg : the first time, it mentionned .ICEauthority, which I deleted before trying again, but it crashed anyway.
The first log (syslog) is :
kernel: [ 9522.432659] type=1503 audit(1258492388.453:26): operation="open" pid=5987 parent=1 profile="/usr/bin/evince" requested_mask="r::" denied_mask="r::" fsuid=20001 ouid=20001 name="/afs/cell.tld/use=
r/u/us/user/.ICEauthority"
kernel: [ 9522.432947] openafs: Can't open inode 62749
kernel: [ 9522.433032] Modules linked in: (lots of stuff)
kernel: [ 9522.433193] RIP: 0010:[<ffffffffa0ff3d35>] [<ffffffffa0ff3d35>] osi_UFSOpen+0x1e5/0x240 [openafs]
kernel: [ 9522.433400] [<ffffffffa0fcf4ee>] afs_UFSWrite+0x3be/0x9f0 [openafs]
kernel: [ 9522.433476] [<ffffffffa0ff84f7>] afs_linux_writepage_sync+0x267/0x4a0 [openafs]
kernel: [ 9522.433540] [<ffffffffa0ff875d>] afs_linux_write_end+0x2d/0x50 [openafs]
kernel: [ 9522.433669] [<ffffffffa0ff2871>] ? crget+0x21/0xa0 [openafs]
kernel: [ 9522.433707] [<ffffffffa0fb1799>] ? PagInCred+0x49/0xd0 [openafs]
kernel: [ 9522.433750] [<ffffffffa0ff6af5>] afs_linux_write+0x285/0x3f0 [openafs]
kernel: [ 9522.433879] RIP [<ffffffffa0ff3d35>] osi_UFSOpen+0x1e5/0x240 [openafs]
kernel: [ 9526.345486] type=1503 audit(1258492392.363:28): operation="open" pid=5990 parent=5989 profile="/usr/bin/evince" requested_mask="r::" denied_mask="r::" fsuid=20001 ouid=20001 name="/afs/cell.tld/u/us/user/.ICEauthority"
The last one is a bit different (after .ICEauthority was manually deleted), since it mentions "/var" as the erroneous path, still accessed by openafs :
[ 519.517002] __ratelimit: 6 callbacks suppressed
[ 519.517010] type=1503 audit(1258494347.339:26): operation="open" pid=3672 parent=3671 profile="/usr/bin/evince-thumbnailer" requested_mask="::rw" denied_mask="::rw" fsuid=20001 ouid=0 name="/var"
[ 519.517021] openafs: Can't open inode 104085
[ 519.517077] ------------[ cut here ]------------
[ 519.517082] kernel BUG at /var/lib/dkms/openafs/1.4.11/build/src/libafs/MODLOAD-2.6.31-14-generic-MP/osi_file.c:87!
[ 519.517088] invalid opcode: 0000 [#1] SMP
[ 519.517096] last sysfs file: /sys/devices/virtual/block/dm-3/uevent
[ 519.517102] CPU 0
[ 519.517107] Modules linked in: binfmt_misc ppdev openafs(P) bnep ipmi_msghandler vboxnetadp vboxnetflt vboxdrv kvm_intel kvm kqemu dlm configfs nfs lockd nfs_acl auth_rpcgss sunrpc bridge stp snd_hda_codec_si3054 snd_hda_codec_realtek snd_usb_audio snd_usb_lib snd_seq_dummy arc4 snd_seq_oss snd_hda_intel snd_hda_codec snd_seq_midi ecb iwlagn snd_rawmidi joydev iwlcore snd_seq_midi_event snd_hwdep sdhci_pci nvidia(P) uvcvideo snd_seq snd_pcm_oss iptable_filter mac80211 snd_mixer_oss snd_seq_device videodev sdhci snd_pcm tpm_infineon snd_timer ip_tables asus_laptop v4l1_compat psmouse tpm v4l2_compat_ioctl32 serio_raw lp btusb tpm_bios led_class snd_page_alloc snd parport cfg80211 soundcore x_tables raid10 raid456 usbhid raid6_pq async_xor async_memcpy async_tx raid1 raid0 multipath linear dm_raid45 xor ohci1394 ieee1394 video output atl1 mii intel_agp
[ 519.517263] Pid: 3672, comm: evince-thumbnai Tainted: P 2.6.31-14-generic #48-Ubuntu M51Sn
[ 519.517269] RIP: 0010:[<ffffffffa10c1d35>] [<ffffffffa10c1d35>] osi_UFSOpen+0x1e5/0x240 [openafs]
[ 519.517325] RSP: 0018:ffff88008edf19f8 EFLAGS: 00010296
[ 519.517331] RAX: 0000000000000036 RBX: ffff8800b5c05000 RCX: 000000000000001e
[ 519.517337] RDX: 0000000000000000 RSI: 0000000000000086 RDI: 0000000000000246
[ 519.517342] RBP: ffff88008edf1a28 R08: 0000000000000032 R09: 000000000000eb2d
[ 519.517348] R10: 0000000000000005 R11: 0000000000000000 R12: 0000000000019695
[ 519.517354] R13: ffffc90013e66740 R14: 0000000000001000 R15: 0000000000000001
[ 519.517361] FS: 00007f6247f04790(0000) GS:ffff880001a00000(0000) knlGS:0000000000000000
[ 519.517367] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 519.517373] CR2: 00007fff988ff448 CR3: 000000008b1c5000 CR4: 00000000000026f0
[ 519.517379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 519.517385] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 519.517392] Process evince-thumbnai (pid: 3672, threadinfo ffff88008edf0000, task ffff880093935ac0)
[ 519.517397] Stack:
[ 519.517401] 0000000000019695 ffffc90013e666a8 ffff880091d85f80 ffffc90013e66740
[ 519.517412] <0> ffffc90013e666a8 ffff880091d85f80 ffff88008edf1b28 ffffffffa1096e06
[ 519.517425] <0> 0000000000000000 ffffc90013e666a8 ffff880000000000 ffff880091d86208
[ 519.517439] Call Trace:
[ 519.517482] [<ffffffffa1096e06>] afs_UFSRead+0x256/0x1270 [openafs]
[ 519.517528] [<ffffffffa10c0871>] ? crget+0x21/0xa0 [openafs]
[ 519.517572] [<ffffffffa10c6bec>] afs_linux_readpage+0x12c/0x460 [openafs]
[ 519.517586] [<ffffffff81277827>] ? radix_tree_insert+0x1c7/0x210
[ 519.517598] [<ffffffff810f1975>] ? __inc_zone_page_state+0x25/0x30
[ 519.517607] [<ffffffff810db7cc>] T.768+0xfc/0x440
[ 519.517616] [<ffffffff810dbbc6>] generic_file_aio_read+0xb6/0x1d0
[ 519.517627] [<ffffffff8111ec32>] do_sync_read+0xf2/0x130
[ 519.517637] [<ffffffff81078b30>] ? autoremove_wake_function+0x0/0x40
[ 519.517681] [<ffffffffa10c0871>] ? crget+0x21/0xa0 [openafs]
[ 519.517719] [<ffffffffa107f799>] ? PagInCred+0x49/0xd0 [openafs]
[ 519.517763] [<ffffffffa10c4d60>] afs_linux_read+0x100/0x2a0 [openafs]
[ 519.517774] [<ffffffff8142007c>] ? apparmor_file_permission+0xac/0xd0
[ 519.517784] [<ffffffff8111f215>] vfs_read+0xb5/0x1a0
[ 519.517792] [<ffffffff8111f81c>] sys_read+0x4c/0x80
[ 519.517802] [<ffffffff81012002>] system_call_fastpath+0x16/0x1b
[ 519.517807] Code: d3 4d 46 e0 0f 0b eb fe 48 c7 c7 48 b8 0d a1 31 c0 e8 c1 4d 46 e0 0f 0b eb fe 44 89 e6 48 c7 c7 98 b9 0d a1 31 c0 e8 ac 4d 46 e0 <0f> 0b eb fe 44 89 e6 48 c7 c7 68 b9 0d a1 e8 99 4d 46 e0 0f 0b
[ 519.517895] RIP [<ffffffffa10c1d35>] osi_UFSOpen+0x1e5/0x240 [openafs]
[ 519.517940] RSP <ffff88008edf19f8>
[ 519.517947] ---[ end trace d5ed6153240e0aa0 ]---
I may mention that /var is partition by itself, in ext4 format on top of lvm :
# df /var -ih
Sys. de fich. Inodes IUtil. ILib. %IUti. Monté sur
/dev/mapper/vg00-local--var
239K 109K 130K 46% /var
# df /var -h
Sys. de fich. Tail. Occ. Disp. %Occ. Monté sur
/dev/mapper/vg00-local--var
3,7G 1,5G 2,1G 42% /var
I set a cache of 500Mb, with some custom options to tune up network transfers by slicing my files in small chunks with :
/sbin/afsd -afsdb -dynroot -fakestat -chunksize 13
(this last parameter, chunksize 13, came as a conclusion of a discussion I initiated few months ago to avoid my network connection to drop at a ping of >2000ms when making lots of upload to the server)
I will appreciate any help on this issue : it is not a huge problem by itself (I can easily replace evince with okular for example), but I don't like when I just drop a problem without really solving it!
Thanks in advance,
Frederic.