[OpenAFS-devel] volserver hangs possible fix
Ted Anderson
TedAnderson@mindspring.com
Mon, 18 Apr 2005 08:43:55 -0400
On 4/12/2005 12:04, Jeffrey Hutzelman wrote:
> On Tuesday, April 12, 2005 11:11:29 AM -0400 Ted Anderson
> <TedAnderson@mindspring.com> wrote:
>
>> Just to make sure I understand this problem, let me try providing a
>> specific scenario:
>> thread A thread B
>> -------- --------
>> 1: lock M;
>> 2: check queue;
>> 3: cond_wait(C,M); // queue is empty
>> 4: -> kernel_scheduler {
>> 5: unlock M;
>> 6: lock M; ...
>> 7: add item; ...
>> 8: unlock M; ...
>> 9: broadcast on C; ... // noop, nobody blocked on C
>> 10: put B on C;
>> 11: desched B;
>> 12: DEADLOCK
>>
>> Is this a problem that can occur with a corrently operating
>> implementation of POSIX threads, i.e. an example of correct but
>> non-deterministic scheduling? I've expanded the operation of cond_wait
>> as I understand it. I observe that M correctly protects the queue
>> itself. Treating "broadcast C" and "put B on C" as black boxes, however,
>> it doesn't seem to help to move broadcast(C) inside M, i.e. by swapping 8
>> & 9. On the other hand, tt would seem a simple matter to reverse the
>> sequence of 5 & 10, but perhaps that is inconvenient for some
>> multi-processor schedulers.
>>
>> As I understand your comment, you are saying that the broadcast must be
>> protected by the mutex or race conditions are possible. What am I
>> missing here?
>
>
> What you're missing here is that cond_wait() is atomic. The lines you
> list as 5 and 10 happen at the same time, so it's not possible for
> thread A to acquire the lock before thread B is on the wait-list for the
> CV. This is why it is necessary to hold the lock through the broadcast
> in order to prevent a race -- otherwise, the broadcast could indeed
> occur between lines 2 and 10, and thread B would miss the wakeup.
Well, I always thought cond_wait() was atomic. But if that is true then
I don't see why broadcast() needs to be covered by the mutex. In the
above example (ignoring the kernel_scheduler stuff), in thread A's
execution the broadcast (9) follows the locked item addition (6-8). So,
either thread B goes first, gets added to the CV's queue and then gets
awoken, or thread A goes first and there the item is added to the queue
and B won't sleep. How does moving broadcast() inside the mutex change
anything?
What I was responding to way this:
On 4/11/2005 16:26, Horst Birthelmer wrote:
> On Apr 11, 2005, at 9:21 PM, Tom Keiser wrote:
>> It is a race condition. pthread_cond_wait() is not an atomic
>> operation; it relies on the mutex to synchronize a set of operations
>> that enqueue the calling thread onto the list of threads blocked on
>> the cond var, drop the associated lock, and become blocked by the
>> kernel scheduler. If you call signal/broadcast without holding the
>> lock, you can race against these bookkeeping operations in
>> pthread_cond_wait(). Thus, a thread in the process of blocking will
>> never find out about that signal/broadcast, because cond vars are not
>> stateful like semaphores.
>
> The mutex passed to a condition variable isn't for guarding the
> bookkeeping, but that's another story, I guess.
It seemed that both Horst and Tom agreed that broadcast() should be
protected, though neither was clear on the reason. The quotes from
POSIX didn't clarify it for me. So what I tried to do was break out the
sub-operations that must compose cond_wait() to try to see how
protecting broadcast() with the mutex would help, but I couldn't
construct a failing case.
Maybe one of you could outline a scenario where broadcast() must be
protected to avoid a race condition?
Ted Anderson