[OpenAFS-devel] Re: pthreading the bosserver

Benjamin Kaduk kaduk@MIT.EDU
Thu, 8 Aug 2013 16:54:51 -0400 (EDT)


On Thu, 8 Aug 2013, Andrew Deason wrote:

> On Thu, 8 Aug 2013 15:40:08 -0400 (EDT)
> Benjamin Kaduk <kaduk@MIT.EDU> wrote:
>
>> First off: do we need to keep an LWP version of the bosserver around
>> as well as a pthreaded one?  I don't think so, and I believe Simon
>> agrees, but it would be good to get consensus.
>
> It does not need to stay very long; lwp fileserver has already been
> removed. If you're asking if you can get rid of the LWP bosserver at the
> same time as introducing a pthreaded bosserver, I think that depends on
> how sure you are that it functions correctly. I would vote for a tbozo
> directory, but if the changes are not complex and you're very
> confident, it may not be necessary. But I think it's easier to implement
> a tbozo, and then remove bozo (and move tbozo into it) when it's just as
> good.

Part of the reason I was pushing for skipping the "separate tbozo" step 
with my implementation is that it was a pretty straightforward translation 
of the old code to pthreaded primitives.  But you're right, and I would be 
less comfortable skipping the tbozo step with the extra changes in the 
earlier patchset.

>> Second, how strong of an integrity guarantee do we need for the bos
>> config?  My understanding is that configuration changes (adding or
>> removing or en/disabling bnodes) are rare events, and it is highly
>> unlikely that multiple administrator connections changing things will
>> be made concurrently.
>
> We can assume they are infrequent, but we must assume that they will
> happen. That is, there needs to be locking, but it doesn't need to be
> very granular. That is, it can be slow, but it cannot cause something to
> break or behave weirdly.

Hmm, okay.  I think that the current LWP code and my tree do not have any 
explicit locking when writing to bnode fields; Chas has a per-bnode lock 
which protects that state.  I don't remember offhand whether the global 
lock in my tree can safely be extended to cover such bnode state.  I guess 
this makes my current CV usage a bit bogus, as it sleeps on the global 
mutex but is not necessarily using it to protect anything.
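For reference, the pthread CV contract requires that the mutex passed to 
pthread_cond_wait actually guard the predicate being waited on; otherwise 
state changes and wakeups can race.  A minimal sketch of the standard 
idiom (the names bozo_glock/bnode_ready are invented for illustration, not 
actual OpenAFS identifiers):

```c
/* Standard pthread CV idiom: the mutex handed to pthread_cond_wait
 * must protect the predicate, or a wakeup can be missed or the state
 * observed inconsistently.  Names here are illustrative only. */
#include <pthread.h>

static pthread_mutex_t bozo_glock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t bozo_cv = PTHREAD_COND_INITIALIZER;
static int bnode_ready = 0;     /* predicate guarded by bozo_glock */

void
wait_for_bnode(void)
{
    pthread_mutex_lock(&bozo_glock);
    while (!bnode_ready)        /* loop: spurious wakeups are allowed */
        pthread_cond_wait(&bozo_cv, &bozo_glock);
    pthread_mutex_unlock(&bozo_glock);
}

void
signal_bnode_ready(void)
{
    pthread_mutex_lock(&bozo_glock);
    bnode_ready = 1;            /* mutate state under the same mutex */
    pthread_cond_broadcast(&bozo_cv);
    pthread_mutex_unlock(&bozo_glock);
}
```

If the global mutex in my tree does not guard the state the CV waits on, 
the waits are only accidentally correct, which is the "bogus" part above.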

>> Relatedly, is it okay to assume that shutdown/restart/etc. will not be
>> issued concurrently with config changes?  A "fully correct"
>> implementation would seem to need to only shutdown/restart the bnodes
>> which were configured when the command was issued, and ignore any new
>> nodes created since then.  Because the implementation of
>> shutdown/restart must drop locks, making this guarantee seems to
>> require additional synchronization effort, whether via a temporary
>> queue to store the bnodes being acted upon, or a higher-level lock.
>
> Are you talking about a 'bos create' racing with a 'bos restart -all'? I
> would think you'd block out all modifications during a restart. While
> the ordering may not matter for 'bos restart -all', it may matter for
> 'bos restart -bosserver', just so it doesn't leave behind a running
> process and then re-exec itself or something.

Well, that's the use case that I backsolved to from reading the code; 
maybe Chas had something else in mind.
Basically, in things like WaitAll, the code takes the global lock and 
holds it while copying the allBnode queue to a temporary queue.  It then 
releases the "main" global lock and holds the "tempq" global lock across 
the blocking operations that make up the wait in the WaitAll.  The main 
global lock can't be held like that, as holding it while blocking would 
tie up all action.  My intuition is telling me that a flag variable 
protected by the global lock could offer the same protection with less 
code complexity, but I didn't try to implement such a scheme or analyze it 
fully.
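The flag-variable scheme I have in mind could look something like the 
following sketch: a single "operation in progress" flag under the global 
lock that excludes configuration changes while a shutdown/restart walks 
the bnode list with the lock dropped.  All names here are invented for 
illustration; this is not OpenAFS code and I have not tried it.

```c
/* Sketch of the flag-variable alternative: one flag under the global
 * lock blocks 'bos create'-style modifications for the duration of a
 * shutdown/restart, even across the blocking waits.  Names invented. */
#include <pthread.h>

static pthread_mutex_t glock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t glock_cv = PTHREAD_COND_INITIALIZER;
static int bnode_op_in_progress = 0;    /* guarded by glock */

/* Called at the start of a shutdown/restart-style operation. */
void
begin_bnode_op(void)
{
    pthread_mutex_lock(&glock);
    while (bnode_op_in_progress)
        pthread_cond_wait(&glock_cv, &glock);
    bnode_op_in_progress = 1;
    pthread_mutex_unlock(&glock);
    /* Blocking work (waiting on child processes) happens here with
     * glock dropped, but modifications are still excluded by the flag. */
}

void
end_bnode_op(void)
{
    pthread_mutex_lock(&glock);
    bnode_op_in_progress = 0;
    pthread_cond_broadcast(&glock_cv);
    pthread_mutex_unlock(&glock);
}

/* Configuration-change paths would wait on the same flag before
 * touching the bnode list. */
```

The point is that it replaces the second "tempq" lock with one predicate, 
at the cost of serializing shutdown/restart against config changes, which 
per the above is acceptable.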

>> I haven't been able to convince myself that the additional complexity
>> of the extra watcher threads is necessary, but if someone else could
>> convince me, that would be good.
>
> My opinion is that we should explicitly drop LINUX24 support on servers
> (or at least tbozo, if we eventually provide both tbozo and bozo).  I
> have never heard of demand for LINUX24 servers, and it's easy to migrate
> off of them. The thing I have heard demand for and is not easy to
> migrate off of is LINUX24 clients, which we could still keep.
>
> I mean, regardless of what solution we end up with, how much testing is
> anyone really going to do for bozo on LINUX24? We're just going to end
> up with something that theoretically works but we're not very confident
> has solved various possible race conditions or whatnot. If we want to
> keep LINUX24 for this, we should at least put a big warning on it that
> mentions something involving the relevant issues.

I'm not going to argue with this proposal :)

> That doesn't deal with any signalling specifics, but keep in mind our
> current bozo signal handling is not always great, and does not
> necessarily need to be fixed at the same time. I've always seen bozo
> misidentify core dumps, which I thought was due to this, but I've never
> really cared.

I think the really important thing is to not miss any SIGCHLD events. 
Misidentifying core dumps is not as important.
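The reason SIGCHLD events are easy to miss is that pending standard 
signals coalesce: one delivery may stand for several exited children.  The 
usual POSIX remedy, sketched below, is to drain waitpid(WNOHANG) in a loop 
on every delivery.  This is the generic idiom, not the bosserver's actual 
code.

```c
/* Reap every child that has already exited; returns the count reaped.
 * Call once per SIGCHLD delivery (e.g. from a dedicated sigwait thread),
 * since a single delivery may represent multiple exits. */
#include <sys/types.h>
#include <sys/wait.h>

int
reap_children(void)
{
    int reaped = 0, status;
    pid_t pid;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        reaped++;
        /* WIFSIGNALED(status) / WCOREDUMP(status) is where core-dump
         * identification would happen. */
    }
    /* pid == 0: children remain but none have exited;
     * pid == -1 with errno == ECHILD: no children left. */
    return reaped;
}
```

The core-dump misidentification Andrew mentions would live in the status 
inspection above, so it can be fixed independently of the draining logic.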



I also failed to note that Chas fixed up the bnode reference counting, 
which is totally bogus in master.  I had deferred that to a follow-up 
commit, which is not currently implemented.  (It should not be hard to 
do.)

-Ben