[OpenAFS] Proposed changes for server log rotation

Fri, 3 Dec 2010 17:25:46 -0500

Responding to various notes -

On Dec 2, 2010, at 11:05 PM, Russ Allbery wrote:

> Jeffrey Altman <jaltman@secure-endpoints.com> writes:
>
>> My one concern to switching to something like syslog by default is that
>> "bos getlog" will need to be re-implemented in a different fashion.
>
> Yeah, this is a very good point.  I think I've used bos getlog maybe three
> times in the past fifteen years, so I never think about it, but I suspect
> others use it more than I do.

If bos getlog/showlog disappearing is the price of getting proper log
rotation, I'd call it a good deal. By my definition of proper, of
course. :-)

On Dec 3, 2010, at 10:47 AM, chas williams - CONTRACTOR wrote:

> perhaps the absolute minimum would be to implement a signal that causes
> the log files to be closed and reopened just like a restart.  this
> could be issued weekly via bosserver to emulate the restart behavior.
> people want new behavior like syslog, would need opt in and change
> command line params (eventually switch to this as the default).

I would certainly find that sufficient.

IMHO the best solution is a HUP-based one that brings AFS log open/close
into the same 'standard' as syslog. The HUP would cause the servers to
close and reopen/create the log files. That does have the 'side effect'
previously mentioned of resetting the logging level. IMHO, that's a
moderately desirable side effect, especially if it also caused the
servers to re-read the (not yet existing) config files. If one has
bumped up the debug level and needs that bump to persist through a
logfile close/reopen, changing the config file to set the new debug
level meets that need.
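
To make the close/reopen idea concrete, here is a minimal sketch. It is
Python for brevity, not the C the servers are written in, and every name
in it (ReopenableLog, request_reopen, install_hup_handler) is made up
for illustration. It assumes the handler only sets a flag and the actual
close/reopen happens at the next write.

```python
import os
import signal


class ReopenableLog:
    """Log file that is closed and reopened when a reopen is requested."""

    def __init__(self, path):
        self.path = path
        self.reopen_requested = False
        self.fh = open(path, "a")

    def request_reopen(self, signum=None, frame=None):
        # Usable as a signal handler: do the minimum here, just set a flag.
        self.reopen_requested = True

    def write(self, msg):
        # Honor a pending reopen before writing the next line.
        if self.reopen_requested:
            self.fh.close()
            # Append mode creates the file if it was renamed away.
            self.fh = open(self.path, "a")
            self.reopen_requested = False
        self.fh.write(msg + "\n")
        self.fh.flush()


def install_hup_handler(log):
    # Hypothetical wiring: SIGHUP asks the log to reopen itself.
    signal.signal(signal.SIGHUP, log.request_reopen)
```

After an external tool renames the file, the server keeps writing to the
renamed file until the signal arrives, then starts a fresh file under
the original name.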

On Dec 3, 2010, at 7:33 AM, Derrick Brashear wrote:

> On Thu, Dec 2, 2010 at 11:05 PM, Russ Allbery <rra@stanford.edu> wrote:
>> Jeffrey Altman <jaltman@secure-endpoints.com> writes:
>> . . .
>> Yeah, this is a very good point.  I think I've used bos getlog maybe three
>> times in the past fifteen years, so I never think about it, but I suspect
>> others use it more than I do.
>
> bos salvage -showlog?

Any time I've cared that much about a given salvage, I've been on the
relevant host tailing the relevant file. It's workarounds like that
which help make the loss of those bos log commands worth the gain.

On Dec 3, 2010, at 10:38 AM, Jeffrey Altman wrote:

> . . .  While I believe that
> defaulting to syslog probably is the right way to go, I think we need to
> consider the impact on documentation and end user best practices.

If sufficient developer time were available, we would be offering site
admins various combinations of logging options at start time. But I
don't see that much developer time available. IMHO the biggest win for
the least work is to pick a signal which, when sent to a server, causes
it to close and re-open the log files. I'll expand below on why this
might be the best solution as well.

On Dec 3, 2010, at 10:38 AM, Jeffrey Altman wrote:

> At the very least I believe that we need to implement an internal log
> rollover mechanism based on either max size or max time with a max
> number of old logs to be maintained.  Implementing this should not be
> overly complicated if we agree that this internal file logging is not
> meant to replace full featured log rotation tool chains.

Sorry, Jeff, but I must disagree strongly with 'with a max number of old
logs to be maintained.' That's the sort of thing best managed with an
external tool, which can decide when to rename, what naming strategy to
use, etc.

I even mildly disagree with 'internal log rollover mechanism based on
either max size or max time.' My desire for better AFS logfile rotation
isn't based on a problem with them getting too big. Mine is partly based
on AFS' predilection for renaming files to '.old' and thereby wiping out
log files when you get multiple restarts close together, and partly on
the inability to sanely rotate and archive those files without having to
restart AFS (and yes, I now see that some of that can be done
semi-automagically). Having the individual files get too big is simply
not an issue for us. If others' mileage varies, they should speak up. But
in general, all the right tools exist (logfile watchers and renamers) to
do that job. I don't think we should try to build such into AFS.

In summary, all the really significant problems with AFS log files are
solved by a close/reopen on HUP or other signal we choose. It's a
simple solution, it dovetails very nicely with existing tools and with
relatively simple cron scripts, "it's the UNIX way", and it's probably
easy to implement. It doesn't break 'bos *log*', either. It's a win.
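
As an illustration of the "relatively simple cron scripts" side, here is
a rough sketch of the external half: rename the live log, then signal
the server. It assumes the proposed reopen-on-HUP behaviour exists
(today it does not), and rotate(), server_pid and the timestamp suffix
are all hypothetical choices; how the script learns the pid is left out.

```python
import os
import signal
import time


def rotate(log_path, server_pid, send_signal=os.kill, now=None):
    """Rename the live log, then ask the server to reopen it."""
    # One-second timestamp suffix; any naming strategy would do here.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    archived = f"{log_path}.{stamp}"
    # The server still holds the file open, so it keeps writing to the
    # renamed file until it handles the signal.
    os.rename(log_path, archived)
    # Assumed behaviour: on SIGHUP the server closes its log and
    # creates a fresh file at log_path.
    send_signal(server_pid, signal.SIGHUP)
    return archived
```

Retention, compression and naming all stay in the script, which is the
point: none of that policy has to live inside the AFS servers.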

The problem with .old logs getting overwritten when a restart occurs is
one I can fix from cron. Mind you, with log close/reopen on HUP I see
no need for the 'feature' of renaming FooLog to FooLog.old on any
restart. Instead we should just keep appending to the existing logfile
if it's already there.
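
A toy model of the difference, for anyone who hasn't been bitten by it.
This is a simulation under my own assumptions, not OpenAFS code: each
"run" logs one line, and the two restart functions stand in for the
rename-to-.old behaviour and the append behaviour respectively.

```python
import os


def restart_with_old_rename(path):
    """Model of rename-on-restart: FooLog becomes FooLog.old."""
    if os.path.exists(path):
        # Silently replaces any earlier .old file.
        os.replace(path, path + ".old")
    return open(path, "w")


def restart_with_append(path):
    """Model of the alternative: keep appending to the existing file."""
    return open(path, "a")


def run_three_times(restart, path):
    """Simulate three server runs in a row, each logging one line."""
    for n in (1, 2, 3):
        with restart(path) as fh:
            fh.write(f"run {n}\n")


def surviving_lines(path):
    """Collect every log line still on disk, .old file first."""
    lines = []
    for name in (path + ".old", path):
        if os.path.exists(name):
            with open(name) as fh:
                lines += fh.read().splitlines()
    return lines
```

With the rename model, three runs in a row leave only the last two runs
on disk; with append, all three survive in one file.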

Steve