[OpenAFS-devel] OpenAFS Development

Jack Neely jjneely@pams.ncsu.edu
Thu, 24 Jun 2004 14:43:29 -0400


--cWoXeonUoKmBZSoM
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Folks,

At North Carolina State University we are quite heavily dependent on AFS
and our usage of Red Hat Enterprise Linux and Fedora Core is growing by
leaps and bounds.  I've recently discovered several things that I think
need some serious attention.

Before I start please refrain from all "they hate us" and "that's just a
bitch to fix" replies.  I'm tired of it and it stifles development.  Would
you want to work on something that you just got told is "just a bitch?"
In return I do offer my services, limited as they are.

The three things I want to mention are PAGs with the 2.6 kernel,
caching, and NPTL.  The reason I'm concerned about the first is obvious.
Caching is the complaint I hear most often about AFS: it's slow.
That's okay, I don't expect it to be like local disk access.  But both
these issues have been slammed home (in my mind at least) by Arla.  I
checked out and poked around yesterday's (6/23) CVS snapshot of Arla and
this is what I found.

My test box is a 2.2GHz laptop with 512MB of RAM running Fedora Core 2
with the 2.6.6-1.435 kernel.  Arla CVS built, installed, and worked
right out of the box.  PAGs.  I had PAGs.  In fact, Arla will try to
hook into the LSM first, and failing that, hooks the syscall table.  For
our convenience, I have attached the code that finds sys_call_table.

I still think that hooking the syscall table is pretty evil and that
there needs to be something in the stock kernel that can generically
handle some sort of authentication key per process.  Recently, there was
a very productive conversation on LKML about a "key-ring" patch between
David Howells and Kyle Moffett.  A link to it was posted here after the
first few messages, but now a rather complex and generic system has been
laid out.  I assumed the OpenAFS folks would be very interested in this
and give some feedback, but there's been silence.  I've spoken with Kyle
myself and he hopes to have a bit of the patch written during this
weekend.  This will be a much cleaner method that we can use to store
and utilize PAGs.  I hope this works out for us.  You can google for the
archive; the subject is "In-kernel Authentication Tokens (PAGs)".

While I was reading the change logs and NEWS about Arla, I discovered
that they had made quite a few improvements to their caching layer.
IIRC, the caching code in OpenAFS loses its efficiency after about
120MB.  The default cache size for Arla is 1.4GB.  I thought I would do
some very rough testing.

C1 = 2.2GHz laptop with 512MB of RAM running Fedora Core 2 with the 
     2.6.6-1.435 kernel and yesterday's CVS of Arla
C2 = 2.8GHz desktop with 512MB of RAM running Red Hat Linux 9 kernel 
     2.4.20-24.9 with OpenAFS 1.2.10.

I ran 'time md5sum' on a 661MB ISO on local disk:

C1 = 25.932 seconds
C2 = 22.549 seconds

Next, 'time cp' copying the ISO to an OpenAFS 1.2.11 fileserver.

C1 = 2m57.643s
C2 = 5m1.363s

Finally, 'time md5sum' again on the file in AFS.

C1 = 22.645s
C2 = 2m53.667s

The idea was to get a rough idea how efficient it was to write from the
VFS cache layer to the AFS cache layer and how fast it was to read from
the AFS cache layer.  Arla's times were...impressive.  I ran the md5sum
on local disk and on AFS multiple times to actually believe that it was
faster to read from Arla's cache than from the local disk.

Finally, NPTL.  We are starting to look at deploying OpenAFS servers to
replace Transarc code.  What's involved in looking at the NPTL issues in
OpenAFS?  These need to be fixed, not worked around.  The workaround
will probably go away soon.  One of my next tasks is to look into this
more closely.

In conclusion, I'm blown away by Arla.  OpenAFS has always worked a
little better for us as a client but now the tables are turned.  Both
projects are Open Source; would a look at some of this code help us?

Jack Neely

-- 
Jack Neely <slack@quackmaster.net>
Realm Linux Administration and Development
PAMS Computer Operations at NC State University
GPG Fingerprint: 1917 5AC1 E828 9337 7AA4  EA6B 213B 765F 3B6A 5B89

--cWoXeonUoKmBZSoM
Content-Type: text/plain; charset=utf-8
Content-Disposition: attachment; filename="nnpfs_syscalls-lossage.c"
Content-Transfer-Encoding: 8bit

/*
 * Copyright (c) 2003-2004 Kungliga Tekniska Högskolan
 * (Royal Institute of Technology, Stockholm, Sweden).
 * All rights reserved.
 * 
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 
 * 3. Neither the name of the Institute nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 * 
 * Alternatively, this software may be distributed under the terms of the
 * GNU General Public License ("GPL").
 * 
 * THIS SOFTWARE IS PROVIDED BY THE INSTITUTE AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE INSTITUTE OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

/* 
 * Originally written for OpenAFS by Chaskiel Grundman <cg2v@andrew.cmu.edu>,
 * mudged somewhat by Love <lha@it.su.se>.
 */

#define __NO_VERSION__

#include <nnpfs/nnpfs_locl.h>
#include <nnpfs/nnpfs_debug.h>
#include <nnpfs/nnpfs_syscalls.h>
#include <linux/sched.h>
#include <linux/unistd.h>

#ifdef RCSID
RCSID("$Id: nnpfs_syscalls-lossage.c,v 1.9 2004/06/21 21:48:55 tol Exp $");
#endif

#ifndef HAVE_KERNEL_SYS_CALL_TABLE

#ifdef LINUX2_5
#include <linux/kallsyms.h>
static const void *lower_bound = &kernel_thread;
#else
#include <asm/pgtable.h>
static const void *lower_bound = &empty_zero_page;
#endif

nnpfs_sys_call_function *sys_call_table;

#if 0
static inline int 
kallsym_is_equal(unsigned long addr, const char *name)
{
    char namebuf[128];
    const char *retname;
    unsigned long size, offset;
    char *modname;

    retname = kallsyms_lookup(addr, &size, &offset, &modname, namebuf);
    if (retname != NULL	&& strcmp(name, retname) == 0 && offset == 0)
        return 1;

    return 0;
}
#endif

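/*
 * Heuristic: a plausible table entry points above lower_bound and
 * below the address of the slot that holds it.
 */
static inline int looks_good(void **p)
{
    if (*p <= (void*)lower_bound || *p >= (void*)p)
	return 0;
    return 1;
}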

int
nnpfs_fixup_syscall_lossage(void)
{
    void **ptr = (void **)&init_mm;
    void **limit;

    sys_call_table = NULL;

    for (limit = ptr + 16 * 1024;
	 ptr < limit && sys_call_table == NULL; ptr++)
    {
	int ok = 1;
	int i;

	for (i = 0; i < 250; i++)
	    if (!looks_good(ptr + i)) {
		ok = 0;
		ptr = ptr + i;
		break;
	    }

	if (ok) {
	    if (ptr[__NR_break] != ptr[__NR_ftime])
		continue;
	    sys_call_table = (nnpfs_sys_call_function *)ptr;
	    break;
	}
    }

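#ifndef LINUX2_5
    /* Cross-check the candidate against two known handlers.  Skip this
     * when the scan failed, since ptr is then past the scanned range. */
    if (sys_call_table != NULL
	&& ((ptr[__NR_close] != (nnpfs_sys_call_function)&sys_close)
	    || (ptr[__NR_chdir] != (nnpfs_sys_call_function)&sys_chdir)))
	sys_call_table = NULL;
#endif /* !LINUX2_5 */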

    if (sys_call_table == NULL) {
	printk("Failed to find address of sys_call_table\n");
 	return -EIO;
    }

    printk("Found sys_call_table at %p\n", sys_call_table);

    return 0;
}

#endif /* !HAVE_KERNEL_SYS_CALL_TABLE */

--cWoXeonUoKmBZSoM--