[OpenAFS-devel] Linux readpage handler

Thu, 26 May 2011 00:48:54 -0400

On Wed, May 25, 2011 at 5:52 PM, Andrew Deason <adeason@sinenomine.net> wrote:
> Brief summary: I'm asking a question about what our Linux readpage
> handler is allowed to do. Some Red Hat folks have been asked about this,
> but so far the response hasn't been very quick or useful, so I thought
> I'd ask here if anyone happened to know.
>
> Hi,
>
> So, I've seen a few Linux panics on 2.6.9-89.mumble (RHEL4) from this
> assert (OpenAFS client is 1.4-based):
>
> Assertion failure in journal_start() at fs/jbd/transaction.c:274:
> "handle->h_transaction->t_journal == journal"
>
> The backtrace shows that this is from libafs trying to write to the
> cache (from afs_GetDownDSlot), via afs_linux_readpage, which is
> triggered from a page fault for a buffer that someone is trying to write
> to an ext3 filesystem (which is separate from the ext3 disk cache fs).
>
> The 'handle->h_transaction->t_journal' in that assert is for the "other"
> ext3 fs, and 'journal' is for the cache ext3 fs (in the panics that I've
> seen).
>
> Based on that, and after reading through some Linux code, it looks like
> ext3 will set current->journal_info, and then try to copy in pages from
> the supplied user buffer, which can trigger libafs. So when we write to
> the ext3 cache, the journal struct we want to use is different than the
> one in the current->journal_info transaction, and ext3 explodes.
>
> Now, the thing is, I can reproduce this same situation on my own box,
> and nothing blows up. From some extra print statements and such, I can
> see that I'm in the afs_GetDownDSlot code path from a page fault, but
> current->journal_info is always NULL by the time afs_linux_readpage gets
> called, so the problem doesn't come up. Which leaves me a bit confused.
>
>
> So, my question here is what is supposed to happen? Is
> current->journal_info supposed to have the journal transaction of the
> current process (in which case I assume the readpage handler is not
> allowed to start write transactions, but I can't find this warned
> against anywhere), or is something supposed to reset the current task's
> journal_info or otherwise somehow guard against this?
>
> There may be some site-specific patches and stuff involved here and I'm
> not trying to debug the panic itself on this list. I'd just like to know
> what is intended to happen in a situation like this, if anyone can
> provide any info.
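
For anyone following along, the failure mode described above can be
sketched in userspace. This is only a model, assuming that journal_start()
reuses whatever handle is already in current->journal_info and asserts
that it belongs to the journal being asked for (which is what the quoted
assertion suggests). All names ending in _sim, the struct layouts, and
task_journal_info are made up for illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: these structs mimic the shape of the jbd types
 * named in the assertion, not their real definitions. */
struct journal { const char *name; };
struct transaction { struct journal *t_journal; };
struct handle { struct transaction *h_transaction; int h_ref; };

/* Hypothetical stand-in for current->journal_info (per-task in the kernel). */
static struct handle *task_journal_info = NULL;

enum start_result { START_NEW, START_NESTED, START_WRONG_JOURNAL };

/* Models the check: an existing handle may only be reused for the same
 * journal. The kernel asserts here; this model returns a code instead. */
static enum start_result journal_start_sim(struct journal *journal,
                                           struct handle *fresh,
                                           struct transaction *txn)
{
    struct handle *handle = task_journal_info;

    if (handle != NULL) {
        if (handle->h_transaction->t_journal != journal)
            return START_WRONG_JOURNAL;  /* where the panic would happen */
        handle->h_ref++;                 /* nested start on the same journal */
        return START_NESTED;
    }
    txn->t_journal = journal;
    fresh->h_transaction = txn;
    fresh->h_ref = 1;
    task_journal_info = fresh;
    return START_NEW;
}

/* Drops one reference; clears the per-task pointer on the last one. */
static void journal_stop_sim(void)
{
    struct handle *handle = task_journal_info;

    if (handle != NULL && --handle->h_ref == 0)
        task_journal_info = NULL;
}

/* Models: a write to `target` starts a handle, copying from the user
 * buffer faults, and the readpage path then writes to `cache`. */
static enum start_result simulate_write_with_fault(struct journal *target,
                                                   struct journal *cache)
{
    struct handle outer, inner;
    struct transaction outer_txn, inner_txn;
    enum start_result r;

    task_journal_info = NULL;
    journal_start_sim(target, &outer, &outer_txn);    /* ext3 write path */
    r = journal_start_sim(cache, &inner, &inner_txn); /* fault -> cache write */
    if (r != START_WRONG_JOURNAL)
        journal_stop_sim();                           /* inner stop */
    journal_stop_sim();                               /* outer stop */
    return r;
}
```

With two different journals the nested start reports START_WRONG_JOURNAL,
which corresponds to the panic. If journal_info is NULL by the time the
cache write happens (the behaviour seen on the test box), the same call
returns START_NEW and nothing goes wrong.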

I don't think it's been an issue before at all, and so it's simply not
been on the radar. That said, I wonder if the site in question uses
btrfs? I seem to recall there were some dodgy btrfs patches a while ago
which diddled with journal state.

Of course, I could totally be in left field here.

-- 
Derrick