[OpenAFS] cache performance

Phil.Moore@morganstanley.com
Fri, 25 Oct 2002 14:33:01 -0400


This is very interesting indeed, but we're way too diverse to impose an
execution mechanism on our environment, especially with the specter of
30,000 Windows boxes all just waiting to finally have a stable
distributed filesystem out of which to run applications.

We've tried similar approaches to yours in the past, with varying
degrees of success, but I think our scale makes such an approach
impractical.

The strategic focus *MUST* be on getting richer statistics out of the
fileservers, so we can perform this analysis centrally.

Then again, if you have a mature logging mechanism like this, it
certainly would complement anything we can gather on the servers, too.

>>>>> "Todd" == Todd M Lewis <utoddl@email.unc.edu> writes:

Todd> You might be interested in what we've done in this area.  We're an 
Todd> academic shop (U. of North Carolina - Chapel Hill), so our needs are 
Todd> admittedly different from yours, but we build a bunch of packages from 
Todd> source, usually for as many of our supported architectures as we can get 
Todd> 'em to build on.  We wanted to know who's using what, so we would know 
Todd> how to spend our limited people resources when deciding what to upgrade, 
Todd> what versions to abandon, etc.

Todd> We came up with a mechanism called runlogger.  Basically, we stick a 
Todd> call to the runlogger client function somewhere near the beginning of a 
Todd> program when we build it. If that's not practical (if it's a script 
Todd> based thing for example) we have it call the stand-alone runlogger 
Todd> client program, and if it comes to it and we really want it logged badly 
Todd> enough, we'll wrap the application in a script that runs the runlogger 
Todd> client before running the program in question.

Todd> The runlogger client takes one parameter -- the name of the package we 
Todd> want to log.  If we need finer grained logging (a pkg might contain 
Todd> several different programs for example), then it could pass the pkg 
Todd> name, a colon, and the program name as one parameter.  Runlogger takes 
Todd> this parameter and concatenates the uid of the user (which is usually 
Todd> who he/she's klogged as) and the AFS @sys name for this architecture 
Todd> (which was hard coded into the runlogger routine at build time) into a 
Todd> colon delimited string and passes it off via UDP to the runloggerd 
Todd> daemon indicated in the runlogger pkg's config file.
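
A minimal sketch of the client side described above, in Python rather
than whatever runlogger is actually written in.  The field order on the
wire, the function names, and the host/port arguments are assumptions
made for illustration; the message only says the fields are joined with
colons and sent via UDP to the daemon named in the config file.

```python
import socket


def build_message(package, uid, sysname, program=None):
    """Join the fields into one colon-delimited string.

    The order sysname:uid:name is an assumption; the original post
    does not state the order used on the wire.
    """
    # Finer-grained logging: "pkg:program" is passed as a single name.
    name = package if program is None else f"{package}:{program}"
    return f"{sysname}:{uid}:{name}"


def send_message(message, host, port):
    """Fire-and-forget: one UDP datagram, no reply expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii"), (host, port))
```

A caller would pass something like os.getuid() for the uid and a
sysname string fixed at build time.  Because UDP is connectionless, the
send does not wait on the daemon, which fits the low startup overhead
mentioned later in the message.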

Todd> runloggerd takes this steady stream of UDP packets from all these 
Todd> different clients, adds to them a time stamp and the IP address of 
Todd> the client, and appends them to its log file.  You get things that look 
Todd> like this (w/ numbers changed to protect the innocent):

>> 2002.05.29.13.32.00 [152.2.1.103]:[rs_aix43]:[5678]:pine-421
>> 2002.05.29.13.32.00 [152.2.1.149]:[sun4x_57]:[0]:lynx-284
>> 2002.05.29.13.32.19 [152.2.1.104]:[rs_aix43]:[5847]:pine-421
>> 2002.05.29.13.32.20 [152.2.68.144]:[sun4x_58]:[26678]:pine-421
>> 2002.05.29.13.32.32 [152.2.1.106]:[rs_aix43]:[9491]:pine-421
>> 2002.05.29.13.32.32 [152.2.1.99]:[rs_aix43]:[3190]:openssh-252p2
>> 2002.05.29.13.32.33 [152.2.48.55]:[sgi_65]:[6309]:tcsh-611

Todd> That's a time stamp, the client IP, the @sys name, uid, and pkg name.
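
The log line layout shown above can be produced and read back with a
few lines of Python.  This is only a sketch inferred from the sample
lines; the names format_line, parse_line and Entry are hypothetical and
are not part of runloggerd.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Layout inferred from the samples: stamp [ip]:[sysname]:[uid]:name
LINE_RE = re.compile(
    r"^(?P<stamp>\d{4}(?:\.\d{2}){5}) "
    r"\[(?P<ip>[^\]]+)\]:\[(?P<sysname>[^\]]+)\]:\[(?P<uid>\d+)\]:(?P<name>.+)$"
)
STAMP_FORMAT = "%Y.%m.%d.%H.%M.%S"


class Entry(NamedTuple):
    when: datetime
    ip: str
    sysname: str
    uid: int
    name: str


def format_line(when, ip, sysname, uid, name):
    """Build one log line the way the samples appear to be laid out."""
    return f"{when.strftime(STAMP_FORMAT)} [{ip}]:[{sysname}]:[{uid}]:{name}"


def parse_line(line):
    """Return an Entry, or None if the line does not match the layout."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    when = datetime.strptime(m.group("stamp"), STAMP_FORMAT)
    return Entry(when, m.group("ip"), m.group("sysname"),
                 int(m.group("uid")), m.group("name"))
```

The name field is matched greedily to the end of the line, so a
"pkg:program" style name survives parsing intact.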

Todd> We routinely analyze the log file to see what's being run, when, by 
Todd> whom, and on what architecture(s).  You can try to log everything, or 
Todd> limit it to only logging those things you're interested in at the moment.
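
As an illustration of the kind of analysis described above (what is
run, by how many users, on which architectures), here is a small Python
sketch that tallies a runloggerd-style log.  The summarize function and
its return shape are assumptions for this example, not the analysis
tooling actually used at UNC.

```python
import re
from collections import Counter, defaultdict

# Matches the bracketed fields seen in the sample log lines:
# [ip]:[sysname]:[uid]:name
FIELDS = re.compile(r"\[([^\]]+)\]:\[([^\]]+)\]:\[(\d+)\]:(\S+)")


def summarize(lines):
    """Count runs per (package, sysname) and distinct uids per package."""
    runs = Counter()
    users = defaultdict(set)
    for line in lines:
        m = FIELDS.search(line)
        if m is None:
            continue  # skip anything that is not a log entry
        _ip, sysname, uid, name = m.groups()
        runs[(name, sysname)] += 1
        users[name].add(uid)
    user_counts = {name: len(uids) for name, uids in users.items()}
    return runs, user_counts
```

Feeding it the log file line by line gives a quick answer to "is anyone
still running this version on this architecture", which is the question
that drives the upgrade and retirement decisions mentioned earlier.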

Todd> We've made a variation of it called pmlogger which lets us see which 
Todd> Perl modules are actually being used.  (Perl module life cycling can be 
Todd> a real pain, and it's a lot easier to drop support for an old module 
Todd> when you know it isn't being used by anybody.)

Todd> I'm sure the file servers could give us other interesting information, 
Todd> but the runlogger/runloggerd approach has given us good results without 
Todd> having to change the production servers.  It adds a little overhead to 
Todd> each logged program's startup, but not much.  If you're interested, I could 
Todd> package it up and make it presentable...