[OpenAFS-devel] Linux 2.6.12 kernel BUG at fs/namei.c:1189

Harald Barth haba@pdc.kth.se
Thu, 19 Jan 2006 13:27:27 +0100 (MET)


Sorry, I have to bring this thread back from the dead. One of my users
has made me run into the same BUG.

> static inline int may_delete(struct inode *dir,struct dentry *victim,int isdir)
> {
>         int error;
> 
>         if (!victim->d_inode)
>                 return -ENOENT;
> 
>         BUG_ON(victim->d_parent->d_inode != dir);

My kernel is 2.6.9, so the check is of course at a different line in the
source, but I got an oops at the "same place", so I think it is the same
problem.
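
For anyone reading along without the kernel source at hand, here is a
minimal user-space sketch of what that check compares. The structs are
cut down to the two fields the check reads, and may_delete_model,
MODEL_WOULD_BUG and the demo_* helpers are names made up for this
illustration. It does not claim to show how OpenAFS ends up in that state,
only which state makes the BUG_ON fire.

```c
#include <stddef.h>
#include <errno.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields read by the check are modelled). */
struct inode {
    unsigned long i_ino;
};

struct dentry {
    struct dentry *d_parent;
    struct inode *d_inode;
};

/* Hypothetical return code for this model; the real kernel calls BUG(). */
#define MODEL_WOULD_BUG 1

/* Mirrors the first two tests of may_delete() as quoted above. */
int may_delete_model(struct inode *dir, struct dentry *victim)
{
    if (!victim->d_inode)
        return -ENOENT;
    if (victim->d_parent->d_inode != dir)
        return MODEL_WOULD_BUG;
    return 0;
}

/* The victim's cached parent and the directory passed in agree. */
int demo_consistent(void)
{
    struct inode dir_inode = { 100 }, file_inode = { 200 };
    struct dentry dir = { NULL, &dir_inode };
    struct dentry file = { &dir, &file_inode };
    return may_delete_model(&dir_inode, &file);
}

/* The victim's cached parent points at a different inode than the
 * directory the caller is operating on. */
int demo_mismatched_parent(void)
{
    struct inode cached_dir = { 100 }, other_dir = { 101 };
    struct inode file_inode = { 200 };
    struct dentry dir = { NULL, &cached_dir };
    struct dentry file = { &dir, &file_inode };
    return may_delete_model(&other_dir, &file);
}

/* A negative dentry (no inode) is rejected before the parent check. */
int demo_negative(void)
{
    struct inode dir_inode = { 100 };
    struct dentry dir = { NULL, &dir_inode };
    struct dentry file = { &dir, NULL };
    return may_delete_model(&dir_inode, &file);
}
```

In this model only demo_mismatched_parent() reaches the branch that
corresponds to the BUG_ON; the other two return before or past it.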

> That makes me wonder if this is relevant somehow to the fact that what I
> was trying to move was a mount point, but possibly not.  It's certainly
> not reliable; I moved several other mount points before trying to move
> this one.

I don't think my user moved any mount points, as his volume's mount
point has been ~/WORK for ages.

> We *did* run into something that looked like this problem on another Linux
> system here with files that weren't mount points, but we could never
> manage to reproduce it and never got BUG output, so that may or may not be
> the same problem.  :/

The BUG output and the user both say that he tried to rm a file, and then
all access to that directory hung. The thing that puzzles me is that
normally people don't use the /afs/.pdc.kth.se/home/$USER way through
the RO to come to their RW home. I have not been able to reproduce it
that way either. In spite of ~1000 machines with OpenAFS, the bug has
happened to me only twice, once in October and again a few days ago. That
is not exactly a lot of data to debug with. Not that I'd know how to
validate the cached dentries anyway. The client version is 1.4.0, btw.

I saw the patch in
https://lists.openafs.org/pipermail/openafs-devel/2005-December/013334.html
by Chas. Did that help you, Russ? I wonder if it would help me. At the
rate this currently shows up, I would maybe know in April....

Russ wrote in another email:
> I mv mount points all the time; maybe I'm weird?

You want me to say "Yes" here, don't you? ;-) ;-)

Can you write a script for me that makes this go "BUG"? I'd put that into
the Arla test suite right away (and we'd know how a machine with Arla reacts
to this).

Harald.