[OpenAFS-devel] locking fairness on Linux?

John S. Bucy bucy-openafs-devel@gloop.org
Wed, 4 May 2005 16:19:40 -0400


(Linux 2.6.11, OpenAFS 1.3.80)

I've mumbled about this in the past (PAG stuff) and I ran into it
again with the microbenchmarking I've been doing on the client.  I'm
doing the same experiment as before for the dentry stuff:

take some set s of files 1..n
permute s
outer loop 1..k:
  for f in s:
    stat(f)


In this instance, I'm running with n=1024 and -stat 100; with only 100
stat cache entries for 1024 files, every stat() should miss the cache
and trigger a FetchStatus.
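A minimal sketch of that loop (the file-name pattern, helper name, and
constants here are my own placeholders, not the actual benchmark code):

```python
import os
import random
import time

N = 1024   # files per set s  (assumed layout: f0..f1023 in one directory)
K = 10     # outer iterations k

def run(dirpath):
    """Stat every file in a permuted order, K times over."""
    files = [os.path.join(dirpath, "f%d" % i) for i in range(N)]
    random.shuffle(files)              # permute s
    t0 = time.monotonic()
    for _ in range(K):                 # outer loop 1..k
        for f in files:                # for f in s: stat(f)
            os.stat(f)
    elapsed = time.monotonic() - t0
    print("%d*%d in %f -> %f sec/1" % (K, N, elapsed, elapsed / (K * N)))
    return elapsed
```

Running ten copies of something like this against the same directory is
what produces the serialized timings below.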

If I run 10 of these concurrently, I see the following:

10*1024 in 20.990940 -> 0.002050 sec/1    process 1
10*1024 in 42.003866 -> 0.004102 sec/1    process 2
10*1024 in 63.042780 -> 0.006157 sec/1    process 3
10*1024 in 84.053341 -> 0.008208 sec/1    process 4
10*1024 in 105.084001 -> 0.010262 sec/1   process 5
10*1024 in 126.089251 -> 0.012313 sec/1   process 6
10*1024 in 147.138416 -> 0.014369 sec/1   process 7
10*1024 in 168.136637 -> 0.016420 sec/1   process 8
10*1024 in 189.177636 -> 0.018474 sec/1   process 9
10*1024 in 210.189635 -> 0.020526 sec/1   process 10

Not only is it serializing, but as far as I can tell, the first one
runs exclusively until it finishes, then the second, and so on.

If I run 2 under strace, whenever I stop one, the other one starts
making progress; if I SIGCONT the stopped one, it doesn't make forward
progress.  The starving one makes forward progress "occasionally,"
presumably when the running one stats the same file that the starved
one has outstanding.

Looking around, it sounds like the Linux struct semaphore that GLOCK
is implemented with is supposed to be fair... so I'm not sure what's
going on here.
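For what it's worth, "fair" here would mean FIFO hand-off: waiters get
the lock in arrival order.  A toy ticket lock (my own illustration,
not the kernel's semaphore code) shows the property I'd expect:

```python
import threading

class TicketLock:
    """Toy FIFO ("fair") lock: waiters are served strictly in arrival order."""
    def __init__(self):
        self._cv = threading.Condition()
        self._next_ticket = 0    # next ticket to hand out
        self._now_serving = 0    # ticket currently allowed to hold the lock

    def acquire(self):
        with self._cv:
            my = self._next_ticket       # take a place in line
            self._next_ticket += 1
            while self._now_serving != my:
                self._cv.wait()          # block until our turn comes up

    def release(self):
        with self._cv:
            self._now_serving += 1       # hand off to the next ticket holder
            self._cv.notify_all()
```

With a lock like this, a releaser that immediately re-acquires goes to
the back of the line; with an unfair lock, it can keep winning, which
is exactly the starvation pattern I'm seeing.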

Even weirder, if I run them in separate directories (separate sets of
files), they make forward progress concurrently.  Until I saw this, I
thought it was a GLOCK fairness problem, but this seems to rule that out.

So it sounds like there's some kind of directory- (vnode-?) level lock
with the property that if you re-acquire it in the same timeslice in
which you last released it, then you always win.



john