[OpenAFS] Re: Starting a server (both DB and FS) without `BOS` (e.g. on Linux with systemd)

Ciprian Dorin Craciun ciprian.craciun@gmail.com
Sun, 10 Mar 2019 13:07:57 +0200


On Sat, Mar 9, 2019 at 11:16 PM Jeffrey Altman <jaltman@auristor.com> wrote:
> The BOS Overseer Service plays a number of roles:


Just wanted to stress that `bos` is wonderful in a distributed
deployment, and I'm quite surprised that to this date we don't have
other "general purpose" alternatives.

However, as stated in the previous email, I'm using OpenAFS in a home /
small office environment, where I'll never have more than one server.
Moreover, the deployment will in the end be done in a dedicated VM.
Thus `bos` seems to be superfluous here.



> 2. The bosserver is responsible for managing the content of many
>    configuration files including BosConfig, UserList, and
>    the server version of the CellServDB file.  The KeyFile can
>    also be updated via bosserver.  The files other than BosConfig
>    are shared with the AFS services.


These files are configured only once, and from what I gather (and from
my experiments) they can easily be created by hand without the `bos`
toolchain.  (Perhaps only the `KeyFile` requires `bos` commands, but
even that does not require the `bos` daemon to be running.)



>    c. fs - a bnode which defines the process group for [...]
>
>    d. dafs - a bnode which defines the process group for the
>       demand attach fileserver.  The bosserver has special knowledge
>       related to process restart in case of failure and integration
>       with the "bos salvage" command.
>
> 3. The bosserver is used to request manual salvages of individual
>    volumes or whole partitions.  When the "fs" bnode is in use,
>    the bnode will be stopped and started while the salvage takes
>    place.  With the "dafs" bnode, single volume salvages do not
>    require the "dafs" bnode to be halted but full partition
>    salvages do.
>
> [...]
>
> > Does the `fileserver` / `dafileserver` actually start the salvage
> > process, or do they communicate this to the `bos` to restart only that
> > service?
>
> Most but not all of these functions could be performed with other tools.
>  Managing the special inter-dependencies of the "fs" and "dafs" bnode
> processes and salvaging are the two exceptions.


And this is where things get "opaque", and the documentation doesn't
give many internal details.

When you say <<the bosserver has special knowledge related to process
restart in case of failure and integration with the "bos salvage"
command>>, by "failure" do you mean "the `fileserver` process just
dies", or that the `fileserver` process somehow "signals" this to the
`bos` server?

Because, from what I gather from what you say, a simplified file server
startup might look like:
* run `salvager` / `dasalvager` and wait for it to terminate;
* run `volserver` / `davolserver` and, in parallel,
* run `fileserver` / `dafileserver`;
* if any of the volume or file servers fail, stop them and restart
from the first step.
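
To make the steps above concrete, here is a minimal Python sketch of
the loop as I understand it.  It only illustrates my guess at the
sequence and is not a description of what `bosserver` actually does.
The function names (`supervise_once`, `supervise`) and the parameters
(`poll_interval`, `max_cycles`) are hypothetical, and the command lines
are passed in by the caller, so no binary paths or flags are assumed.

```python
import subprocess
import time


def supervise_once(salvager_cmd, server_cmds, poll_interval=0.5):
    """Run one cycle of the guessed sequence; return the servers' exit codes."""
    # Step 1: run the salvager and wait for it to terminate.
    # check=True raises CalledProcessError if the salvager exits non-zero.
    subprocess.run(salvager_cmd, check=True)

    # Steps 2 and 3: start the volume and file servers in parallel.
    procs = [subprocess.Popen(cmd) for cmd in server_cmds]
    try:
        # Step 4: wait until any one of the servers exits.
        while all(p.poll() is None for p in procs):
            time.sleep(poll_interval)
    finally:
        # Stop whatever is still running before the next cycle.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        for p in procs:
            p.wait()
    return [p.returncode for p in procs]


def supervise(salvager_cmd, server_cmds, max_cycles=None, poll_interval=0.5):
    """Repeat the cycle; max_cycles=None means loop forever."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        supervise_once(salvager_cmd, server_cmds, poll_interval)
        cycles += 1
    return cycles
```

It would be called with something like
`supervise(["salvager"], [["volserver"], ["fileserver"]])`, with the
real paths and arguments filled in.  Note that this sketch treats any
exit of either server as a "failure", which is exactly the part I am
unsure about.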

Thanks,
Ciprian.