[OpenAFS] Re: BUG: unable to handle kernel paging request at 0000000000004f62

Marc Dionne marc.c.dionne@gmail.com
Wed, 18 Dec 2013 16:21:28 -0500


On Wed, Dec 18, 2013 at 2:47 PM, Jose Manuel dos Santos Calhariz
<jose.calhariz@tecnico.ulisboa.pt> wrote:
> On 18-12-2013 04:03, Andrew Deason wrote:
>>
>> On Tue, 17 Dec 2013 15:02:59 +0000
>> Jose Manuel dos Santos Calhariz <jose.calhariz@tecnico.ulisboa.pt> wrote:
>>
>>> I have a virtual machine that started giving BUG messages after I
>>> upgraded it to Debian wheezy (v7.0), Linux kernel 3.2.
>>> Every night this machine runs tar commands to back up files from AFS.
>>> Usually on the second night I get these BUG messages
>>> and some of the running tar processes stop.
>>>
>>> The openafs kernel module is the same version before and after the
>>> upgrade.  So this may be an incompatibility between
>>> kernel 3.2 and openafs 1.6.5, whereas kernel 2.6.32 worked fine.
>>
>> I'm a little confused; what did you upgrade from? OpenAFS 1.6.5 is not
>> in squeeze (even in backports). Do you just mean that you upgraded the
>> kernel from 2.6.32 to 3.2, but the machine in general was running
>> wheezy? Or were you upgrading from squeeze, but somehow had OpenAFS
>> 1.6.5 on it?
>
>
> I made a personal backport of OpenAFS 1.6.5 for squeeze.  So the
> machine was running squeeze, kernel 2.6.32, for several months without
> problems.  It is now running wheezy with openafs-client 1.6.5 from
> backports and openafs-module-dkms 1.6.5.
>
>
>
>
>>
>> Can you check 'rxdebug <client> 7001 -version' and make sure that the
>> version number and 'built' date make sense? It should not be possible
>> to be running the old kernel module or anything like that, but just as a
>> sanity check...
>
>
> In the meantime I have upgraded the kernel module to 1.6.5.1, so
> 'rxdebug localhost 7001 -version' gives 'AFS version:  OpenAFS
> 1.6.5.1-1-debian built  2013-12-17'
>
>
>
>>> [76628.451414] BUG: unable to handle kernel paging request at 0000000000004f62
>>> [76628.452165] IP: [<ffffffffa037988c>] lock_page+0x13/0x2c [openafs]
>>
>> What filesystem are you using for your openafs cache?
>>
>> If you're using a weird filesystem for that, that might explain this,
>> but to try and get more information:
>
>
> I am using a tmpfs filesystem.

Ah, that would be the key here.  There is a problem with kernels 3.1+
and OpenAFS if you use tmpfs to hold the cache.  There is a fix that
will be part of the upcoming 1.6.6 release, but as far as I can tell
it has not been included in 1.6.5 or 1.6.5.1.
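
For anyone wanting to check whether their own client is in this
situation, here is a rough sketch (not an OpenAFS tool) that reads
/proc/mounts and reports which filesystem type holds the cache
directory.  The path /var/cache/openafs is an assumption; use whatever
directory is listed in your cacheinfo file.  The helper name
fs_type_for_path is made up for this example.

```python
import os


def fs_type_for_path(path, mounts_text):
    """Return the filesystem type of the mount containing `path`.

    `mounts_text` is text in /proc/mounts format:
    "<device> <mount point> <fs type> <options> ...", one mount per line.
    Mount points with escaped characters (e.g. \\040) are not decoded here.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point = os.path.normpath(fields[1])
        fs_type = fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on a tie the later line wins,
            # since a later mount shadows an earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


if __name__ == "__main__":
    cache_dir = "/var/cache/openafs"  # assumed path; check your cacheinfo
    with open("/proc/mounts") as mounts:
        print(cache_dir, "is on", fs_type_for_path(cache_dir, mounts.read()))
```

If this reports tmpfs on a 3.1+ kernel, the problem above may apply.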

Marc