[OpenAFS-devel] Thundering herds and the vnode state machine

Matt W. Benjamin matt@linuxbox.com
Tue, 21 Feb 2012 13:27:37 -0500 (EST)


Hi,

Responding non-specifically wrt DAFS, it might be worth mentioning that at the other extreme from broadcast, more or less, is the concept of an explicit wait queue (and generalized queues, with or without priority).  There's infrastructure for explicit wait queues in the XCB implementation.
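
To make the idea concrete, here is a minimal sketch of an explicit wait queue, combined with the "declare your intent" suggestion in the quoted message below. It is not the XCB or DAFS code; every name in it (vn_lock, vn_waiter, vn_acquire, vn_release, vn_wake) is hypothetical, and a single mutex stands in for the volume global lock. Each waiter sleeps on its own condition variable, so a state change signals only the waiters that can actually proceed: either one exclusive waiter, or the run of readers at the head of the queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum vn_intent { VN_READ, VN_EXCL };   /* what the waiter will do once woken */

struct vn_waiter {                     /* one queue entry per sleeping thread */
    enum vn_intent intent;
    int granted;                       /* set by the waker, guards spurious wakeups */
    pthread_cond_t cv;                 /* private condvar: no broadcast needed */
    struct vn_waiter *next;
};

struct vn_lock {
    pthread_mutex_t mu;                /* assumption: stands in for the global lock */
    int readers;                       /* number of current shared holders */
    int writer;                        /* 1 while held exclusively */
    struct vn_waiter *head, *tail;     /* FIFO of waiters */
};

void vn_init(struct vn_lock *l)
{
    pthread_mutex_init(&l->mu, NULL);
    l->readers = 0;
    l->writer = 0;
    l->head = l->tail = NULL;
}

/* Called with mu held. Grants access to the waiters at the head of the
 * queue that can make progress and signals only those. Returns the count. */
static int vn_wake(struct vn_lock *l)
{
    int woken = 0;
    while (l->head != NULL && !l->writer) {
        struct vn_waiter *w = l->head;
        if (w->intent == VN_EXCL) {
            if (l->readers > 0)
                break;                 /* exclusive waiter must wait for readers */
            l->writer = 1;
        } else {
            l->readers++;
        }
        l->head = w->next;
        if (l->head == NULL)
            l->tail = NULL;
        w->granted = 1;
        pthread_cond_signal(&w->cv);
        woken++;
    }
    return woken;
}

void vn_acquire(struct vn_lock *l, enum vn_intent intent)
{
    pthread_mutex_lock(&l->mu);
    if (l->head == NULL && !l->writer &&
        (intent == VN_READ || l->readers == 0)) {
        /* Uncontended: take the state directly. */
        if (intent == VN_EXCL)
            l->writer = 1;
        else
            l->readers++;
    } else {
        /* Contended: queue behind earlier waiters and sleep privately. */
        struct vn_waiter w;
        w.intent = intent;
        w.granted = 0;
        w.next = NULL;
        pthread_cond_init(&w.cv, NULL);
        if (l->tail != NULL)
            l->tail->next = &w;
        else
            l->head = &w;
        l->tail = &w;
        while (!w.granted)
            pthread_cond_wait(&w.cv, &l->mu);
        pthread_cond_destroy(&w.cv);
    }
    pthread_mutex_unlock(&l->mu);
}

/* Drops the caller's hold and wakes whoever can now run. Returns the
 * number of waiters woken, which is 0 when the queue is empty. */
int vn_release(struct vn_lock *l, enum vn_intent intent)
{
    int woken;
    pthread_mutex_lock(&l->mu);
    if (intent == VN_EXCL) {
        assert(l->writer);
        l->writer = 0;
    } else {
        assert(l->readers > 0);
        l->readers--;
    }
    woken = vn_wake(l);
    pthread_mutex_unlock(&l->mu);
    return woken;
}
```

The queue is strictly FIFO here, which keeps exclusive waiters from starving but also means a reader that arrives behind a queued exclusive waiter has to wait its turn. A priority variant would change only the order in which vn_wake walks the queue.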

Regards,

Matt

----- "Simon Wilkinson" <simonxwilkinson@gmail.com> wrote:

> 
> It's these broadcasts that cause us problems on multi-processor
> systems. Firstly, we broadcast regardless of the state change that has
> just occurred. If we have gone into an exclusive state, then we're
> waking up a load of threads that will be unable to make any progress.
> Secondly, broadcasting wakes up all pending threads, but the volume
> global lock means that only one can make progress. If the one that
> wins this race requires exclusive access, then all of the other woken
> threads will in turn acquire the global lock, note that they can't
> gain access to the vnode, and go back to sleep again. On a contended
> system, this will lead to a huge number of false wakeups. Thirdly,
> there are some situations where we broadcast multiple times for a
> single state change.
> 
> I think any solution to this would require threads to indicate what
> they are going to do once they have waited. This would allow us to
> selectively wake threads requiring exclusive access but broadcast to
> threads requiring read access. These wakeups would then only be
> performed if the state that we have transitioned into would allow
> those threads to make forward progress.
> 
> I'd welcome input from others more familiar with this code as to
> whether this is actually a problem, or if I'm missing something with
> the pthread condvar implementation that mitigates the problem.
> 
> Cheers,
> 
> Simon.
> 

-- 
Matt Benjamin
The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI  48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309