[OpenAFS] cache performance

Tue, 29 Oct 2002 16:18:59 -0500

Phil.Moore@morganstanley.com wrote:
> This is very interesting indeed, but we're way too diverse to impose an
> execution mechanism on our environment, especially with the specter of
> 30,000 Windows boxes all just waiting to finally have a stable
> distributed filesystem out of which to run applications.

I don't quite follow you.  The overhead of having an application send a 
UDP datagram on startup is not a particularly onerous imposition. And it 
doesn't require any reconfiguration of the client.
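
To give a feel for how little is involved, here is a minimal sketch of 
the client side in Python. It is an illustration only, not the packaged 
runlogger code. The daemon address, the @sys value, and the 
sysname:uid:pkg field order on the wire are all assumptions made for the 
example; the real client takes the address from its config file and has 
the @sys name fixed at build time.

```python
import os
import socket

# Hypothetical values for illustration: the real client reads the daemon
# address from the runlogger config file and has @sys compiled in.
RUNLOGGERD_ADDR = ("127.0.0.1", 9999)
SYSNAME = "i386_linux24"


def build_message(pkg, uid=None, sysname=SYSNAME):
    """Build the colon-delimited payload (assumed order: sysname:uid:pkg)."""
    if uid is None:
        uid = os.getuid()
    # pkg goes last so a "pkg:program" name cannot be confused with the
    # fixed fields in front of it.
    return f"{sysname}:{uid}:{pkg}".encode("utf-8")


def runlogger(pkg, addr=RUNLOGGERD_ADDR):
    """Send one datagram and return; never block or fail the caller."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(build_message(pkg), addr)
    except OSError:
        pass  # a lost log entry must not keep the application from starting
```

With these assumptions, build_message("pine-421", uid=5678, 
sysname="rs_aix43") yields b"rs_aix43:5678:pine-421". There is no 
connection setup and no reply to wait for, which is why the startup cost 
stays small.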

> We've tried similar approaches to yours in the past, with varying
> degrees of success, but I think our scale makes such an approach
> impractical.

The scale penalty would be that you might drop some of the datagrams, 
which means at worst your logs may not reflect every single invocation. 
I doubt it would happen very often though, even at the scale you're 
talking about.  It's a pretty low overhead mechanism.

> The strategic focus *MUST* be on getting richer statistics out of the
> fileservers, so we can perform this analysis centrally.

Maybe I wasn't clear, but the logging happens centrally, wherever you 
choose to run the runloggerd daemon, so analysis is central as well.  I 
agree some interesting info could be gleaned from the fileservers, but 
you can put runlogger into just the apps/pkgs you are interested in and 
get very focused logs to analyze. (Or do what we do and log everything 
you can get your fingers into.)
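
The central side is not much bigger. The following is a rough Python 
sketch of a collector in the spirit of runloggerd, not the real daemon. 
It assumes the client sends a sysname:uid:pkg payload (my assumption for 
the example), and the names format_entry and serve are made up here. The 
output line follows the layout of the log excerpt in the quoted message 
below.

```python
import socket
import time

STAMP_FORMAT = "%Y.%m.%d.%H.%M.%S"


def format_entry(payload, client_ip, when=None):
    """Turn an assumed 'sysname:uid:pkg' payload into one log line."""
    text = payload.decode("utf-8", "replace").strip()
    # maxsplit=2 keeps a "pkg:program" name intact in the last field;
    # a payload with fewer than three fields raises ValueError.
    sysname, uid, pkg = text.split(":", 2)
    stamp = time.strftime(STAMP_FORMAT, when or time.localtime())
    return f"{stamp} [{client_ip}]:[{sysname}]:[{uid}]:{pkg}"


def serve(bind_addr, log_path, max_packets=None):
    """Receive datagrams and append one line per packet to log_path."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(log_path, "a") as log:
        sock.bind(bind_addr)
        handled = 0
        while max_packets is None or handled < max_packets:
            payload, (client_ip, _port) = sock.recvfrom(1024)
            handled += 1
            try:
                entry = format_entry(payload, client_ip)
            except ValueError:
                continue  # skip malformed packets instead of dying
            log.write(entry + "\n")
            log.flush()
```

Because the time stamp and client IP are added on the receiving end, the 
clients need no clock agreement and cannot misreport their own address 
in the payload.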

> Then again, if you have a mature logging mechanism like this, it
> certainly would complement anything we can gather on the servers, too.

You're welcome to give it a try.  I've cleaned up the code a bit and 
packaged it up for public consumption.  Point a browser at

    http://www.unc.edu/~utoddl/runlogger.2.tgz

and give it a smoke test.

As always, I'd welcome any comments or suggestions for improvements you 
or anyone else might have.
-- 
    +----------------------------------------------------------------+
   / Todd_Lewis@unc.edu                  http://www.unc.edu/~utoddl /
  /(919) 962-5273  Linux - It's now safe to turn on your computer. /
+----------------------------------------------------------------+

>>>>> "Todd" == Todd M Lewis <utoddl@email.unc.edu> writes:
> 
> Todd> You might be interested in what we've done in this area.  We're an 
> Todd> academic shop (U. of North Carolina - Chapel Hill), so our needs are 
> Todd> admittedly different from yours, but we build a bunch of packages from 
> Todd> source, usually for as many of our supported architectures as we can get 
> Todd> 'em to build on.  We wanted to know who's using what, so we would know 
> Todd> how to spend our limited people resources when deciding what to upgrade, 
> Todd> what versions to abandon, etc.
> 
> Todd> We came up with a mechanism called runlogger.  Basically, we stick a 
> Todd> call to the runlogger client function somewhere near the beginning of a 
> Todd> program when we build it. If that's not practical (if it's a script 
> Todd> based thing for example) we have it call the stand-alone runlogger 
> Todd> client program, and if it comes to it and we really want it logged badly 
> Todd> enough, we'll wrap the application in a script that runs the runlogger 
> Todd> client before running the program in question.
> 
> Todd> The runlogger client takes one parameter -- the name of the package we 
> Todd> want to log.  If we need finer grained logging (a pkg might contain 
> Todd> several different programs for example), then it could pass the pkg 
> Todd> name, a colon, and the program name as one parameter.  Runlogger takes 
> Todd> this parameter and concatenates the uid of the user (which is usually 
> Todd> who he/she's klogged as) and the AFS @sys name for this architecture 
> Todd> (which was hard coded into the runlogger routine at build time) into a 
> Todd> colon delimited string and passes it off via UDP to the runloggerd 
> Todd> daemon indicated in the runlogger pkg's config file.
> 
> Todd> runloggerd takes this steady stream of UDP packets from all these 
> Todd> different clients, adds to them a time stamp and the IP address of 
> Todd> the client, and appends them onto its log file.  You get things that 
> Todd> look like this (w/ numbers changed to protect the innocent):
> 
> Todd>     2002.05.29.13.32.00 [152.2.1.103]:[rs_aix43]:[5678]:pine-421
> Todd>     2002.05.29.13.32.00 [152.2.1.149]:[sun4x_57]:[0]:lynx-284
> Todd>     2002.05.29.13.32.19 [152.2.1.104]:[rs_aix43]:[5847]:pine-421
> Todd>     2002.05.29.13.32.20 [152.2.68.144]:[sun4x_58]:[26678]:pine-421
> Todd>     2002.05.29.13.32.32 [152.2.1.106]:[rs_aix43]:[9491]:pine-421
> Todd>     2002.05.29.13.32.32 [152.2.1.99]:[rs_aix43]:[3190]:openssh-252p2
> Todd>     2002.05.29.13.32.33 [152.2.48.55]:[sgi_65]:[6309]:tcsh-611
> 
> Todd> That's a time stamp, the client IP, the @sys name, uid, and pkg name.
> 
> Todd> We routinely analyze the log file to see what's being run, when, by 
> Todd> whom, and on what architecture(s).  You can try to log everything, or 
> Todd> limit it to only logging those things you're interested in at the moment.
> 
> Todd> We've made a variation of it called pmlogger which lets us see which 
> Todd> Perl modules are actually being used.  (Perl module life cycling can be 
> Todd> a real pain, and it's a lot easier to drop support for an old module 
> Todd> when you know it isn't being used by anybody.)
> 
> Todd> I'm sure the file servers could give us other interesting information, 
> Todd> but the runlogger/runloggerd approach has given us good results without 
> Todd> having to change the production servers.  It adds a little overhead to 
> Todd> each logged program's startup, but not much.  If you're interested, I could 
> Todd> package it up and make it presentable...
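
For the analysis step mentioned in the quoted message above, a small 
Python sketch of tallying a log in the format shown there might look 
like this. It is not one of our actual analysis scripts; parse_line and 
count_by are names invented for the example, and the regular expression 
assumes every line matches the sample layout exactly.

```python
import re
from collections import Counter

# One log line: stamp [ip]:[sysname]:[uid]:pkg
LINE_RE = re.compile(
    r"^(?P<stamp>\S+) "
    r"\[(?P<ip>[^\]]+)\]:\[(?P<sysname>[^\]]+)\]:\[(?P<uid>\d+)\]:"
    r"(?P<pkg>.+)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if malformed."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_by(lines, *fields):
    """Count log entries grouped by the given field names."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[tuple(record[name] for name in fields)] += 1
    return counts
```

Calling count_by(open(path), "pkg") gives invocations per package, and 
count_by(open(path), "pkg", "sysname") breaks that down by architecture, 
which is the kind of question that drives upgrade and retirement 
decisions.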