[OpenAFS-devel] Re: pthreading the bosserver

Benjamin Kaduk kaduk@MIT.EDU
Thu, 8 Aug 2013 19:54:26 -0400 (EDT)


On Thu, 8 Aug 2013, Chas Williams (CONTRACTOR) wrote:

> In message <alpine.GSO.1.10.1308081643340.24720@multics.mit.edu>,Benjamin Kaduk writes:
>> Hmm, okay.  I think that the current LWP code and my tree do not have any
>> explicit locking when writing to bnode fields; Chas has a per-bnode lock
>> which protects that state.  I don't remember offhand whether the global
>> lock in my tree can safely be extended to cover such bnode state.  I guess
>> this makes my current CV usage a bit bogus, as it sleeps on the global
>> mutex but is not necessarily using it to protect anything.
>
> I just took the traditional approach to solving the problem.  Instead of
> relying on LWP's cooperative nature, I just used locks where they probably
> should have been originally.

Definitely.  I am just not sure that the bosserver needs quite the 
fine-grained locking that you did (i.e., maybe you put in more effort than 
is needed).  There shouldn't be anything incorrect about it.
I was trying to approach from the other side ("do as little as possible"), 
and seem to have put in not quite enough work, given the discussion here.

>>> Are you talking about a 'bos create' racing with a 'bos restart -all'? I
>>> would think you'd block out all modifications during a restart. While
>>> the ordering may not matter for 'bos restart -all', it may matter for
>>> 'bos restart -bosserver', just so it doesn't leave behind a running
>>> process and then re-exec itself or something.
>>
>> Well, that's the use case that I backsolved to from reading the code;
>> maybe Chas had something else in mind.
>> Basically, in things like WaitAll, the code takes the global locks and
>> holds it while copying the allBnode queue to a temporary queue.  It then
>> releases the "main" global lock and holds the "tempq" global lock across
>> the blocking operations that make up the wait in the WaitAll.  The main
>> global lock can't be held like that, as holding it while blocking would
>> tie up all action.  My intuition is telling me that a flag variable
>> protected by the global lock could offer the same protection with less
>> code complexity, but I didn't try to implement such a scheme or analyze it
>> fully.
>
> For the restart cases ('restart -all' and 'restart -bosserver') the user
> (or bosserver itself) should probably be prevented from creating new
> bnodes since the bnode list shouldn't grow while the various shutdown
> procs are being applied.
>
> That would be easy enough to fix with a flag/lock.

Well, I think your tempBnodes_lock is performing this function, or nearly 
this function.  I am not convinced that the extra tempBnodes queue is 
necessary, though; I think we can just have a flag and skip the extra 
queue and lock.

>> I think the really important thing is to not miss any SIGCHLD events.
>> Misidentifying core dumps is not as important.
>
> This is why you have a thread calling waitpid() for each child.  You don't
> need to catch SIGCHLD.
>
>> I also failed to note that Chas fixed up the bnode reference counting,
>> which is totally bogus in master.  I had deferred that to a follow-up
>> commit, which is not currently implemented.  (It should not be hard to
>> do.)
>
> It was important to get this right so I didn't need to hold the locks
> as often.  The original code was mostly correct but benefited greatly
> from LWP's cooperative nature.

Interesting, that's not quite as I would have expected.  The reference 
count is serving only to keep the bnode from being deleted, I thought, not 
as a write barrier.  So I would not have thought the locking would add a 
substantial overhead.  And yes, "totally bogus" was a bit overblown, it 
was mostly okay but have a few places that were quite wrong.

I'll keep looking at the windows stuff (maybe I just need to #include 
<winsock2.h> and it will just work?) and per-bnode locks a bit more, but 
at the moment, my leaning is that the easiest way forward would be to take 
your code and replace the tempBnodes queue with the global flag mentioned 
above.  Would you be okay if I went and did that?  This would leave us 
with a tbozo, and we could get rid of the LWP copy after a bit of time.

-Ben