[OpenAFS] namei interface lockf buggy on Solaris (and probably HP-UX and AIX)
Mon, 11 Sep 2006 12:45:40 -0400
I propose we move this discussion to -devel.
On 9/11/06, Rainer Toebbicke <email@example.com> wrote:
> The namei interface uses file locking extensively, implemented using
> lockf() on Solaris, AIX & HP-UX.
> Unfortunately lockf() locks and unlocks from the *current position* to
> whatever the argument says (end of file), so moving the file pointer in
> between becomes a problem for the subsequent unlock! The result is
> that frequently locks aren't released, but replaced by partial locks
> on the file data just moved over.
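To make the quoted failure mode concrete, here is a minimal sketch. It is not taken from the namei code: the temp file template and the names parent_holds_lock() and lockf_partial_unlock_demo() are made up for illustration, and it assumes a POSIX system where lockf() locks are visible through fcntl(F_GETLK). It locks from offset 0, moves the file pointer to 50, "unlocks", and then asks a forked child which ranges the parent still holds.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and ask, via F_GETLK, whether the
 * parent holds a lock overlapping the range starting at `start` with
 * length `len` (len == 0 means "to end of file").
 * Returns 1 if locked, 0 if not, -1 on error. */
static int parent_holds_lock(int fd, off_t start, off_t len)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = start;
        fl.l_len = len;
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type != F_UNLCK ? 1 : 0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int rc = WEXITSTATUS(status);
    return rc <= 1 ? rc : -1;
}

/* Returns a bitmask: bit 0 set if bytes 0..49 are still locked after the
 * "unlock", bit 1 set if bytes 50.. are still locked; -1 on error. */
int lockf_partial_unlock_demo(void)
{
    char path[] = "/tmp/lockf_demo_XXXXXX";
    char buf[100];
    int result = -1;
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives on until close() */
    memset(buf, 'x', sizeof(buf));

    if (write(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf)
        && lseek(fd, 0, SEEK_SET) == 0
        && lockf(fd, F_LOCK, 0) == 0        /* locks offset 0 onwards  */
        && lseek(fd, 50, SEEK_SET) == 50    /* I/O moves the pointer   */
        && lockf(fd, F_ULOCK, 0) == 0) {    /* unlocks only 50 onwards */
        int head = parent_holds_lock(fd, 0, 50);
        int tail = parent_holds_lock(fd, 50, 0);
        if (head >= 0 && tail >= 0)
            result = head | (tail << 1);
    }
    close(fd);
    return result;
}
```

Called from a main(), lockf_partial_unlock_demo() should return 1: the range from offset 50 onwards is released, but bytes 0..49 stay locked even though the caller believes it has unlocked the file.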
At least on AIX and Solaris, lockf() is nothing more than an
inflexible wrapper around fcntl() byte-range locks. My vote is to
transition to fcntl (where we can explicitly pass in a base offset and
length). This eliminates the call semantics change introduced by your
patch, and eliminates the unnecessary syscall overhead. I further
object because I'm working on a patch which will allow us to use
pread/pwrite on platforms which support it. This will completely
eliminate fcntl(F_DUPFD,...) and lseek() overhead in the fd package,
so any new requirement to call lseek could erode the performance
improvement I'm seeing. However, the real motivation for switching to
pread/pwrite is a fairly serious locking bug:
As it turns out, the way we use file locks in the volume package is
quite broken. The spec says that once a process closes *any* file
descriptor referring to a file, all fcntl locks that process holds on
that file are immediately released, including locks taken through a
different descriptor. This means that the pthread fileserver/volserver
can have some interesting races given how the ih package fd cache
allows multiple concurrent descriptors per inode handle. I have sample code
sitting around somewhere which demonstrates this fault.
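The following is not that sample code, only a minimal single-process sketch of the close() behaviour. The names lock_visible_to_child() and fcntl_lock_lost_on_close_demo() and the temp file template are hypothetical, and no ih package or fd cache code is involved. It takes a whole-file fcntl lock with an explicit base offset and length (l_whence = SEEK_SET, l_start = 0, l_len = 0), then opens and closes a second descriptor on the same file and checks whether the lock survived.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a forked child uses F_GETLK to see whether the
 * parent holds any lock on the file. Returns 1/0, or -1 on error. */
static int lock_visible_to_child(int fd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock fl;
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;                   /* 0 = through end of file */
        if (fcntl(fd, F_GETLK, &fl) < 0)
            _exit(2);
        _exit(fl.l_type != F_UNLCK ? 1 : 0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    int rc = WEXITSTATUS(status);
    return rc <= 1 ? rc : -1;
}

/* Returns 1 if the lock was held before, and gone after, an unrelated
 * descriptor on the same file was closed; 0 otherwise; -1 on error. */
int fcntl_lock_lost_on_close_demo(void)
{
    char path[] = "/tmp/fcntl_demo_XXXXXX";
    int before = -1, after = -1;
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;             /* explicit base offset ...   */
    fl.l_start = 0;
    fl.l_len = 0;                       /* ... and length (whole file) */

    if (fcntl(fd1, F_SETLK, &fl) == 0) {
        before = lock_visible_to_child(fd1);
        int fd2 = open(path, O_RDONLY); /* second descriptor, same file */
        if (fd2 >= 0) {
            close(fd2);                 /* drops the lock taken via fd1 */
            after = lock_visible_to_child(fd1);
        }
    }
    unlink(path);
    close(fd1);
    if (before < 0 || after < 0)
        return -1;
    return before == 1 && after == 0;
}
```

On a system that follows the POSIX rule, fcntl_lock_lost_on_close_demo() should return 1. In a threaded server the open()/close() pair would be another thread recycling a cached descriptor for the same inode, which is where the race comes from.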