[OpenAFS-devel] [CSL #307050] afs_xsetgroups32 patch i386_linux26

David Thompson thomas@cs.wisc.edu
Fri, 18 Aug 2006 16:03:24 -0500


OK...I'm stumped.  Consider a new version of afs_xsetgroups32
(afs/LINUX/osi_groups.c) that sets a PAG through the setgroups() system call
(an identical patch is applied to afs_xsetgroups):

afs_xsetgroups32(int gidsetsize, gid_t * grouplist)
{
    long code;
    cred_t *cr;
    afs_uint32 junk;
    int pag;

    if (gidsetsize > 1 &&
        (pag = afs_get_pag_from_groups(grouplist[0], grouplist[1]))
          != NOPAG) {
        /* The first two gids encode a PAG: pass only the remaining groups on. */
        code = (*sys_setgroups32p) (gidsetsize-2, grouplist+2);
    } else {
        /* No PAG in the list: remember the caller's current PAG, if any. */
        lock_kernel();
        cr = crref();
        pag = PagInCred(cr);
        crfree(cr);
        unlock_kernel();
        code = (*sys_setgroups32p) (gidsetsize, grouplist);
    }

    if (code) {
        return code;
    }

    if (pag != NOPAG) {
        /* install pag if there's room. */
        lock_kernel();
        cr = crref();
        code = setpag(&cr, pag, &junk, 0);
        crfree(cr);
        unlock_kernel();
    }

    /* Linux syscall ABI returns errno as negative */
    return (-code);
}
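
For anyone who wants to trace the branch logic outside the kernel, here is a minimal user-space sketch of the decision the function above makes. It is only a model under stated assumptions: toy_pag_from_groups() is a hypothetical stand-in and does not reproduce the real afs_get_pag_from_groups() encoding, the NOPAG value is assumed, and struct split_result and split_groups() are names invented for illustration. The current_pag argument plays the role of PagInCred(cr).

```c
#include <stddef.h>
#include <stdint.h>

#define NOPAG 0xffffffffu /* assumed sentinel meaning "no PAG" */

/*
 * Toy decoder, NOT the OpenAFS encoding: packs the low 16 bits of each gid
 * into one 32-bit value and accepts it only if the top byte is 'A'.
 */
static uint32_t toy_pag_from_groups(uint32_t g0, uint32_t g1)
{
    uint32_t pag = ((g0 & 0xffffu) << 16) | (g1 & 0xffffu);
    return ((pag >> 24) == 'A') ? pag : NOPAG;
}

/* Hypothetical result type: the PAG to install and the groups to pass on. */
struct split_result {
    uint32_t pag;
    const uint32_t *rest;
    int rest_count;
};

/*
 * Mirrors the if/else in the patch: if the first two gids decode to a PAG,
 * strip them from the list; otherwise keep the list whole and fall back to
 * the caller's current PAG.
 */
static struct split_result split_groups(const uint32_t *groups, int count,
                                        uint32_t current_pag)
{
    struct split_result r;
    uint32_t pag = NOPAG;

    if (count > 1)
        pag = toy_pag_from_groups(groups[0], groups[1]);

    if (pag != NOPAG) {
        r.pag = pag;
        r.rest = groups + 2;
        r.rest_count = count - 2;
    } else {
        r.pag = current_pag;
        r.rest = groups;
        r.rest_count = count;
    }
    return r;
}
```

Calling split_groups() with a list whose first two entries decode to a PAG yields the list minus those two entries; any other list comes back unchanged together with current_pag. The model only ever sees ordinary user-space arrays, so it says nothing about how the kernel version reads grouplist.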

The new form of the function works correctly on single-processor and SMP kernels, but crashes on hugemem kernels (see below).

Is there anyone who would be willing to help debug this?  I'm a little confused as to why both afs_xsetgroups and afs_xsetgroups32 appear in the stack trace below...

Thanks much.

Dave Thompson
UW-Madison

PS -- Here's a (sample) kernel crash:

Aug 15 16:40:45 clover kernel: Unable to handle kernel paging request at
virtual address fef3b5b4
Aug 15 16:40:45 clover kernel: Unable to handle kernel paging request at
virtual address fef3b5b4
Aug 15 16:40:45 clover kernel: printing eip:
Aug 15 16:40:45 clover kernel: printing eip:
Aug 15 16:40:45 clover kernel: f8b49d62
Aug 15 16:40:45 clover kernel: *pde = 00000000
Aug 15 16:40:45 clover kernel: *pde = 00000000
Aug 15 16:40:45 clover kernel: Oops: 0000 [#1]
Aug 15 16:40:45 clover kernel: Oops: 0000 [#1]
Aug 15 16:40:45 clover kernel: SMP
Aug 15 16:40:45 clover kernel: Modules linked in: md5 ipv6 libafs(U)
iptable_filter ip_tables xfs_quota(U) xfs(U) dm_mirror dm_mod button
battery ac uhci_hcd ehci_hcd shpchp e1000 floppy ext3 jbd
 3w_9xxx(U) sd_mod scsi_mod
Aug 15 16:40:45 clover kernel: CPU: 1
Aug 15 16:40:45 clover kernel: EIP: 0060:[<f8b49d62>] Tainted: PF VLI
Aug 15 16:40:45 clover kernel: EFLAGS: 00010206 (2.6.9-34.0.2.ELhugemem)
Aug 15 16:40:45 clover kernel: EIP is at
afs_xsetgroups32+0x11/0x117 [libafs]
Aug 15 16:40:45 clover kernel: eax: 000000ce ebx: d087efc4 ecx: fef3b5b0
edx: 003bd000
Aug 15 16:40:45 clover kernel: esi: fef3c904 edi: fef3c890 ebp: fef3b5b0
esp: d087efa8
Aug 15 16:40:45 clover kernel: ds: 007b es: 007b ss: 0068
Aug 15 16:40:45 clover kernel: Process quickxcgi (pid: 2963,
threadinfo=d087e000 task=cba500b0)
Aug 15 16:40:45 clover kernel: Stack: 000000ce 000000ce d087efc4
fef3c904 fef3c890 d087e000 fffec220 00000004
Aug 15 16:40:45 clover kernel: fef3b5b0 009b9ff4 fef3c904 fef3c890
fef3b588 000000ce 0000007b 0000007b
Aug 15 16:40:45 clover kernel: 000000ce 008777a2 00000073 00000212
fef3b580 0000007b
Aug 15 16:40:45 clover kernel: Call Trace:
Aug 15 16:40:45 clover kernel: Code: <3>Debug: sleeping function called
from invalid context at include/linux/rwsem.h:43
Aug 15 16:40:45 clover kernel: in_atomic():0[expected: 0],
irqs_disabled():1
Aug 15 16:40:45 clover kernel: [<0211ffc0>] __might_sleep+0x7d/0x89
Aug 15 16:40:45 clover kernel: [<02154914>] rw_vm+0xe4/0x29c
Aug 15 16:40:45 clover kernel: [<f8b49d37>] afs_xsetgroups+0xfa/0x114 [libafs]
Aug 15 16:40:45 clover kernel: [<f8b49d37>] afs_xsetgroups+0xfa/0x114 [libafs]
Aug 15 16:40:45 clover kernel: [<02154d8b>] get_user_size+0x30/0x57
Aug 15 16:40:45 clover kernel: [<f8b49d37>] afs_xsetgroups+0xfa/0x114 [libafs]
Aug 15 16:40:45 clover kernel: [<021061ab>] show_registers+0x115/0x16c
Aug 15 16:40:45 clover kernel: [<02106342>] die+0xdb/0x16b
Aug 15 16:40:45 clover kernel: [<021227cc>] vprintk+0x136/0x14a
Aug 15 16:40:45 clover kernel: [<0211b11a>] do_page_fault+0x421/0x5f7
Aug 15 16:40:45 clover kernel: [<f8b49d62>] afs_xsetgroups32+0x11/0x117 [libafs]
Aug 15 16:40:45 clover kernel: [<0214282a>] __pagevec_free+0x15/0x1a
Aug 15 16:40:45 clover kernel: [<0214723b>] release_pages+0x13b/0x143
Aug 15 16:40:45 clover kernel: [<0211acf9>] do_page_fault+0x0/0x5f7
Aug 15 16:40:45 clover kernel: [<0215007b>] move_vma+0x15a/0x185
Aug 15 16:40:45 clover kernel: [<f8b49d62>]
afs_xsetgroups32+0x11/0x117 [libafs]
Aug 15 16:40:45 clover kernel: Bad EIP value.
Aug 15 16:40:45 clover kernel: <0>Fatal exception: panic in 5 seconds