[CSL #248914] [OpenAFS] flock() behavior openafs-1.2.11

David Thompson thomas@cs.wisc.edu
Wed, 01 Dec 2004 13:18:28 -0600


Derrick J Brashear wrote:
>On Fri, 29 Oct 2004, David Thompson wrote:
>
>> The ACLs on the directory give this user read access but no write.
>> Like I said, the openafs code is the same on both boxes.
>>
>> Has anyone seen anything like this?  Is this a kernel change?
>
>Perchance does one support 64 bit locks (F_GETLK64) and the other doesn't?

(And now we return you to your previous program, in progress...)

So, here's the original flock problem, back after a few weeks' hiatus:

Box 1 (flock works per the man page):

open("/s/db-apps/lib/GradApp/lib/GradApp.pbk", O_RDONLY|O_LARGEFILE) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE, 0xbfffecb0) = -1 EINVAL (Invalid argument)
_llseek(4, 0, [0], SEEK_CUR)            = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=9192, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
flock(4, LOCK_SH)                       = 0

Box 2 (flock doesn't work per the man page):

open("/s/db-apps/lib/GradApp/lib/GradApp.pbk", O_RDONLY|O_LARGEFILE) = 4
ioctl(4, SNDCTL_TMR_TIMEBASE or TCGETS, 0xbfffae08) = -1 EINVAL (Invalid argument)
_llseek(4, 0, [0], SEEK_CUR)            = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=9192, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
flock(4, LOCK_SH)                       = -1 EACCES (Permission denied)

sys_flock (in fs/locks.c) is significantly different between the two kernels:

Box 1 (works):
asmlinkage long sys_flock(unsigned int fd, unsigned int cmd)
{
        struct file *filp;
        int error, type;

        error = -EBADF;
        filp = fget(fd);
        if (!filp)
                goto out;

        error = flock_translate_cmd(cmd);
        if (error < 0)
                goto out_putf;
        type = error;

        error = -EBADF;
        if ((type != F_UNLCK)
#ifdef MSNFS
                && !(type & LOCK_MAND)
#endif
                && !(filp->f_mode & 3))
                goto out_putf;

        lock_kernel();
        error = flock_lock_file(filp, type,
                                (cmd & (LOCK_UN | LOCK_NB)) ? 0 : 1);
        unlock_kernel();

out_putf:
        fput(filp);
out:
        return error;
}


Box 2 (borked):
asmlinkage long sys_flock(unsigned int fd, unsigned int cmd)
{
        struct file *filp;
        int error, type;

        error = -EBADF;
        filp = fget(fd);
        if (!filp)
                goto out;

        error = flock_translate_cmd(cmd);
        if (error < 0)
                goto out_putf;
        type = error;

        error = -EBADF;
        if ((type != F_UNLCK)
#ifdef MSNFS
                && !(type & LOCK_MAND)
#endif
                && !(filp->f_mode & 3))
                goto out_putf;

        lock_kernel();

        /*
         * Execute any filesystem-specific flock routines.  The filesystem may
         * maintain supplemental locks.  This code allows the supplemental locks
         * to be kept in sync with the vfs flock lock.  If flock() is called on
         * a lock already held for the given filp, the current flock lock is
         * dropped before obtaining the requested lock.  This unlock operation
         * must be completed for the any filesystem specific locks and the vfs
         * flock lock before proceeding with obtaining the requested lock.  When
         * the filesystem routine drops a lock for such a request, it must
         * return -EDEADLK, allowing the vfs lock to be dropped, and the
         * filesystem code is then re-executed to obtain the lock.
         *
         * A non-blocking request that returns EWOULDBLOCK also causes any vfs
         * flock lock to be released, but then returns the error to the caller.
         */
        if (filp->f_op && filp->f_op->lock) {
repeat:
                error = flock_fs_file(filp, type, cmd);

                if (error < 0) {
                        /*
                         * We may have dropped a lock.  We need to
                         * finish unlocking before returning or
                         * continuing with lock acquisition.
                         */
                        if (error != -ENOLCK)
                                flock_lock_file(filp, F_UNLCK, 0);

                        /*
                         * We already held the lock in some mode, and
                         * had to drop filesystem-specific locks before
                         * proceeding.  We come back through this
                         * routine to unlock the vfs flock lock.  Now go
                         * back and try again.  Using EAGAIN as the
                         * error here would be better, but the one valid
                         * error value defined for flock(), EWOULDBLOCK,
                         * is defined as EAGAIN.
                         */
                        if (error == -EDEADLK)
                                goto repeat;

                        goto out_unlock_putf;
                }
        }

        error = flock_lock_file(filp, type,
                                (cmd & (LOCK_UN | LOCK_NB)) ? 0 : 1);

        /*
         * If we failed to get the vfs flock, we need to clean up any
         * filesystem-specific lock state that we previously obtained.
         */
        if (error && filp->f_op && filp->f_op->lock)
                flock_fs_file(filp, F_UNLCK, 0);        

out_unlock_putf:
        unlock_kernel();

out_putf:
        fput(filp);
out:
        return error;
}


Do these differences explain the difference in behavior that I'm seeing?  Is 
my naive reading correct that Box 2 is actually using the afs locking code, 
while Box 1 is not?

Dave