[OpenAFS] Re: volserver crashing

Derrick Brashear shadow@gmail.com
Tue, 12 Apr 2011 13:04:38 -0400

On Tue, Apr 12, 2011 at 1:01 PM, Andrew Deason <adeason@sinenomine.net> wrote:
> On Tue, 12 Apr 2011 12:42:12 -0400
> Derrick Brashear <shadow@gmail.com> wrote:
>> > Dumb question: Would I have to restart bosserver? Can I do so
>> > without being disruptive (i.e. restarting the fileserver process) to
>> > my users?
>> the problem is, the ulimit applies to "me and my future children" so
>> you'd need to effectively cause the current running bosserver to ask
>> to modify its ulimit for future fileservers, or ask the current
>> fileserver to do so for itself. so, not short of binary patching at
>> runtime.
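
to see the "me and my future children" part for yourself, here is a small
sketch that can be run in any POSIX shell. it has nothing to do with
OpenAFS itself; the subshell just stands in for a parent process such as
bosserver, and the variable name is made up for illustration.

```shell
# Lower the core limit inside a subshell, then ask a child process what it got.
# The space in "$( (" keeps the shell from parsing this as arithmetic.
child_limit=$( (ulimit -c 0; sh -c 'ulimit -c') )
echo "child sees: $child_limit"          # the child inherits the lowered limit

# The shell that launched the subshell keeps whatever limit it already had.
echo "parent still has: $(ulimit -c)"
```

the limit only flows downward at fork/exec time, which is why changing it
in some other shell does nothing for an already-running bosserver.
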
> But it's the volserver we want to change the limits for.

truth, and the answer here is wretched but...

> Normally, yes,
> you'd stop bosserver completely and start it up again.
> But technically, you could run a different volserver with a different
> ulimit outside of bosserver. If you _really_ want to avoid restarting
> the fileserver, you can:
> $ cd /usr/afs/bin
> $ mv volserver volserver.real
> $ cat <<EOF > volserver
>> #!/bin/sh
>> sleep 1000000
>> exit 0
>> EOF
> $ chmod a+x volserver
> $ kill -TERM <volserver pid>
> Which will kill the existing volserver, and bosserver will restart it by
> running that no-op shell script. So you are free to run your own
> volserver in whatever environment. Something like:

you don't even need to do that. have the shell script ulimit the
volserver and just let it run the real volserver.

which is similarly wretched.
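
a minimal sketch of that wrapper, assuming the real binary has already
been moved to /usr/afs/bin/volserver.real as in the quoted steps. the
scratch file name volserver.wrapper is made up for illustration, and this
is just as unsupported as everything else in this thread.

```shell
# Write the wrapper to a scratch file in the current directory so it can be
# reviewed before it goes anywhere near /usr/afs/bin.
wrapper=./volserver.wrapper

cat > "$wrapper" <<'EOF'
#!/bin/sh
# Raise the core file size limit for this process. A non-root process can
# only raise the soft limit as far as the hard limit allows.
ulimit -c unlimited
# exec replaces the shell with the real volserver, so the pid that
# bosserver started is the volserver itself. "$@" passes along whatever
# arguments the wrapper was started with.
exec /usr/afs/bin/volserver.real "$@"
EOF

chmod a+x "$wrapper"
```

once it looks right, move it into place as /usr/afs/bin/volserver and
kill -TERM the running volserver; bosserver should then restart it
through the wrapper. to undo, move volserver.real back over it.
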

> $ ulimit -c unlimited
> $ ./volserver.real -nojumbo
> And make sure you
> $ mv volserver.real volserver
> after starting the volserver, so bosserver can start the real volserver
> "normally" again. After you get a core, just kill the fake volserver
> process, and bosserver will restart the real volserver.
> This is all unsupported stuff, and I should tell you to not do this in
> production, etc etc. But if you wanted to know how to do this without
> interrupting the fileserver, there you go. If you're not sure of what
> you're doing, you're probably better off just restarting bosserver
> entirely (after setting ulimit), which does involve stopping and
> starting the fileserver.