[OpenAFS] Re: Strange problem with OpenAFS on Virtual Machine
Claudio Prono
claudio.prono@atpss.net
Fri, 09 Jul 2010 17:05:02 +0200
Andrew Deason wrote:
> On Fri, 09 Jul 2010 15:43:56 +0200
> Claudio Prono <claudio.prono@atpss.net> wrote:
>
>
>> Hi to all,
>>
>> I am doing some tests with OpenAFS on a virtual machine, and I have a
>> strange problem.
>>
>
> What kind? Xen, VMware, ...? Is this Linux (what kernel)?
>
>
VMware, version 2.0.2, on openSUSE 11.2 64-bit. Kernel 2.6.31.8-0.1.
>> And then AFS hangs. If I try to access AFS, it is not
>> possible...
>>
>
> Are you using dynroot? Are you sure your root.afs/root.cell is accessible?
>
>
Yes, I use dynroot, and yes, root.afs/root.cell is accessible.
>> Any hint of what is going on? The physical machine has 12 GB of RAM, so
>> I don't think it is a memory problem. I have also looked at the logs
>> in /var/log/openafs, and everything looks OK; nothing there helps to
>> debug this situation...
>>
>
> Syslog or dmesg would be more likely to indicate a problem.
>
>
dmesg says this:
Found system call table at 0xfffffffe (exported)
Address 0xfffffffe is not writable.
System call hooks will not be installed; proceeding anyway
Starting AFS cache scan...<4>printk: 3 messages suppressed.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
afs_osi_Alloc: Can't vmalloc 65536 bytes.
afsd: memCache allocation failure at 95168 KB.
afsd: memory cache too large for available memory.
afsd: AFS files cannot be accessed.
found 0 non-empty cache files (0%).
BUG: unable to handle kernel NULL pointer dereference at 00000147
IP: [<f90bf68c>] :libafs:afs_GetDownD+0xb6/0x674
*pdpt = 000000003359b001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:12.0/class
Modules linked in: libafs(P) vmsync vmmemctl vmblock iptable_filter
ip_tables ip6_tables x_tables fuse ext2 loop dm_mod ppdev parport_pc
parport rtc_cmos rtc_core rtc_lib pcnet32 vmxnet mii container i2c_piix4
i2c_core ac button sg sr_mod shpchp cdrom pci_hotplug intel_agp agpgart
sd_mod edd ext3 mbcache jbd fan ata_piix libata dock BusLogic scsi_mod
thermal processor [last unloaded: speedstep_lib]
Pid: 2749, comm: afs_cachetrim Tainted: P N (2.6.25.20-0.1-pae #1)
EIP: 0060:[<f90bf68c>] EFLAGS: 00010206 CPU: 0
EIP is at afs_GetDownD+0xb6/0x674 [libafs]
EAX: 000000d7 EBX: f11372e4 ECX: f11372e4 EDX: f9132528
ESI: 00000008 EDI: 00000000 EBP: f3d71fac ESP: f3d71e58
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process afs_cachetrim (pid: 2749, ti=f3d70000 task=f2cb30e0 task.ti=f3d70000)
Stack: f3d71fb8 00000000 f9132528 f9132528 00000000 00000000 00000000 00000004
       00000000 00000010 00000000 00000000 000007e5 f9132528 00000000 00000000
       00000000 f11372e4 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
[<f90c2a28>] afs_CacheTruncateDaemon+0x114/0x380 [libafs]
[<f91018e3>] afsd_thread+0x34b/0x5c8 [libafs]
[<c0106d37>] kernel_thread_helper+0x7/0x10
=======================
Code: e9 66 05 00 00 8b 95 e0 fe ff ff 0f b6 14 32 89 95 b0 fe ff ff 80
e2 46 0f 85 04 01 00 00 8b 95 b4 fe ff ff 8b 04 b2 85 c0 74 0b <66> 83
78 70 00 0f 85 ec 00 00 00 8b 85 b8 fe ff ff f6 85 b0 fe
EIP: [<f90bf68c>] afs_GetDownD+0xb6/0x674 [libafs] SS:ESP 0068:f3d71e58
---[ end trace 6d6609a58d98c17c ]---
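
The numbers in the log above can be pulled out mechanically. What follows is a minimal Python sketch, not something from the original exchange: the function name `summarize_memcache_failure` and the `SAMPLE` text are illustrative, and it assumes the "allocation failure at N KB" figure is the amount of memory cache already allocated when the failing vmalloc call happened. Under that reading, 95168 KB in 65536-byte chunks means 1487 chunks were obtained before vmalloc space ran out, which would point at the memory cache size versus the kernel's vmalloc area rather than at total RAM.

```python
import re

# Patterns match the afsd/libafs messages exactly as they appear in the dmesg output.
CHUNK_RE = re.compile(r"afs_osi_Alloc: Can't vmalloc (\d+) bytes")
FAIL_RE = re.compile(r"memCache allocation failure at (\d+) KB")


def summarize_memcache_failure(dmesg_text):
    """Extract the vmalloc failure count and memcache sizes from dmesg text."""
    chunk = CHUNK_RE.search(dmesg_text)
    fail = FAIL_RE.search(dmesg_text)
    chunk_bytes = int(chunk.group(1)) if chunk else None
    failed_at_kb = int(fail.group(1)) if fail else None

    # Assumption: the "failure at N KB" value is what had been allocated so far.
    chunks_before_failure = None
    if chunk_bytes and failed_at_kb is not None:
        chunks_before_failure = (failed_at_kb * 1024) // chunk_bytes

    return {
        "vmalloc_failures": dmesg_text.count("out of vmalloc space"),
        "chunk_bytes": chunk_bytes,
        "failed_at_kb": failed_at_kb,
        "chunks_before_failure": chunks_before_failure,
    }


# Shortened excerpt of the log, for illustration only.
SAMPLE = """\
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
allocation failed: out of vmalloc space - use vmalloc=<size> to increase size.
afs_osi_Alloc: Can't vmalloc 65536 bytes.
afsd: memCache allocation failure at 95168 KB.
afsd: memory cache too large for available memory.
"""

if __name__ == "__main__":
    print(summarize_memcache_failure(SAMPLE))
```

The kernel message itself names the `vmalloc=<size>` boot parameter as one way to enlarge that area; a smaller memory cache would be the other obvious direction to try.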
> To debug... first try running 'cmdebug <client>'; give the output if it
> outputs anything (even if it doesn't look interesting).
>
>
cmdebug -server afs-test hangs, with no output at all.
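
Since the command hangs, it can help to run it under a time limit so that a script collecting diagnostics does not block forever. This is only a sketch under assumptions: `run_with_timeout` is a hypothetical helper name, the 10-second limit is arbitrary, and the only `cmdebug` invocation used is the one shown above.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv and return (status, output).

    status is 'ok', 'failed', 'timeout', or 'missing' (command not found).
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout", ""
    except FileNotFoundError:
        return "missing", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout + proc.stderr


if __name__ == "__main__":
    # Same invocation as in the message; 'afs-test' is the client host name.
    print(run_with_timeout(["cmdebug", "-server", "afs-test"], 10))
```

A "timeout" result here carries the same information as the hang: the cache manager on the client is not answering.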
> If you're on Linux and have an appropriately-configured kernel,
> 'echo t > /proc/sysrq-trigger' will give a trace of all processes in
> dmesg/syslog. Put that in a pastebin or something, if you're okay with
> giving out a listing of all the processes on that machine.
>
>
Here is part of the output:
Sched Debug Version: v0.07, 2.6.25.20-0.1-pae #1
now at 6078856.829940 msecs
.sysctl_sched_latency : 20.000000
.sysctl_sched_min_granularity : 4.000000
.sysctl_sched_wakeup_granularity : 5.000000
.sysctl_sched_batch_wakeup_granularity : 10.000000
.sysctl_sched_child_runs_first : 0.000001
.sysctl_sched_features : 15
cpu#0, 2110.751 MHz
.nr_running : 1
.load : 1024
.nr_switches : 679665
.nr_load_updates : 135738
.nr_uninterruptible : 8
.jiffies : 1444708
.next_balance : 1.444772
.curr->pid : 3407
.clock : 6069562.101723
.idle_clock : 5975798.490928
.prev_clock_raw : 6088500.878042
.clock_warps : 0
.clock_overflows : 629337
.clock_underflows : 7855
.clock_deep_idle_events : 1
.clock_max_delta : 4.000245
.cpu_load[0] : 1024
.cpu_load[1] : 513
.cpu_load[2] : 284
.cpu_load[3] : 198
.cpu_load[4] : 163
cfs_rq
.exec_clock : 66685.662816
.MIN_vruntime : 0.000001
.min_vruntime : 31027.376357
.max_vruntime : 0.000001
.spread : 0.000000
.spread0 : 0.000000
.nr_running : 1
.load : 1024
.bkl_count : 3775
.nr_spread_over : 83
runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------
R           bash  3407     31007.376360       461   120     31007.376360       112.477999    382268.403908
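
The `.name : value` lines in this output are regular enough to parse. The sketch below is illustrative only (the function name `parse_sched_debug` and the section handling are assumptions about this particular dump, not a general parser for every kernel version). It groups fields under the `cpu#N` and `cfs_rq` headers, so that a value such as `nr_uninterruptible` can be read directly. A value of 8 there may indicate that several tasks are stuck in uninterruptible sleep, which would be consistent with processes blocked on AFS, though the dump alone does not prove that.

```python
import re

# Matches lines such as ".nr_uninterruptible : 8" or ".curr->pid : 3407".
FIELD_RE = re.compile(r"^\s*\.(\S+)\s*:\s*(\S+)\s*$")


def parse_sched_debug(text):
    """Group '.name : value' fields by section ('global', 'cpu#0', 'cfs_rq')."""
    sections = {"global": {}}
    current = "global"
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("cpu#") or stripped.startswith("cfs_rq"):
            # Header such as "cpu#0, 2110.751 MHz": keep only the part before the comma.
            current = stripped.split(",")[0]
            sections.setdefault(current, {})
            continue
        match = FIELD_RE.match(line)
        if match:
            sections[current][match.group(1)] = float(match.group(2))
    return sections


# Shortened excerpt of the output, for illustration only.
SAMPLE = """\
now at 6078856.829940 msecs
.sysctl_sched_latency : 20.000000
cpu#0, 2110.751 MHz
.nr_running : 1
.nr_uninterruptible : 8
.curr->pid : 3407
cfs_rq
.nr_running : 1
.bkl_count : 3775
"""

if __name__ == "__main__":
    parsed = parse_sched_debug(SAMPLE)
    print(parsed["cpu#0"]["nr_uninterruptible"])
```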
--
--------------------------------------------------------------------------------
Claudio Prono OPST
System Developer
Gsm: +39-349-54.33.258
@PSS Srl Tel: +39-011-32.72.100
Via San Bernardino, 17 Fax: +39-011-32.46.497
10141 Torino - ITALY http://atpss.net/disclaimer
--------------------------------------------------------------------------------
PGP Key - http://keys.atpss.net/c_prono.asc