Re: [perf-discuss] Improved Performance MIB for OpenSolaris - proposal

Brendan Gregg - Sun Microsystems Wed, 11 Feb 2009 14:57:35 -0800

On Tue, Feb 10, 2009 at 11:56:10PM -0600, Jason King wrote:
> On Tue, Feb 10, 2009 at 11:08 PM, Brendan Gregg - Sun Microsystems
> <bren...@sun.com> wrote:
> > G'Day Folks,
> >
> > On Tue, Feb 10, 2009 at 08:03:17PM +0000, Peter Tribble wrote:
> > [...]
> >> Create a net-snmp module that exposes well known Solaris performance
> >> metrics via SNMP.  If possible, this will include presenting kstat
> >> metrics in a  generic fashion via SNMP.
[...]
> >
> > ... and if we start with what's needed instead of what Solaris provides,
> > then we may have a generic enough performance MIB to port to other
> > systems. :)
> 
> Well what I wanted to start with is the data presented by the *stat
> commands (vmstat, mpstat, etc.).  In most cases, they are just showing
> the difference between a number of kstats over a chosen time interval,
> which just happens to make the implementation a bit easier.   Or to
> think of it another way, for the initial piece at least, the interface
> is the same metrics seen using the *stat commands (so to speak), the
> fact kstats are used in obtaining the numbers is an implementation
> detail.
> 
> I think that would address most of the stability concerns.
> 
> If (as was suggested) we add the ability to present kstats in a more
> generic fashion (I think that would be in addition to the above
> piece), it would need to be in a way that if new kstats are added, or
> old ones deleted, the MIB would not need to be updated.  I think
> everyone here knows that kstats are subject to change without notice.
> However if that's all that's available at the time, a working 'wrong'
> solution is better than a non-existent 'right' solution.


Why would the right solution be non-existent?  It's not hard to add kstats.
The world of performance has too many wrong solutions - it confuses customers
and can lead to purchases based on bad information.  For a recent example,
read Bryan's article on the commonly requested SPEC SFS benchmark:
http://blogs.sun.com/bmc/entry/eulogy_for_a_benchmark

> As it is, the initial impetus for this was trying to do a basic
> compare box A to box B for work, to be able to do at least rudimentary
> evaluation for consolidation for zones.  This means looking at
> historic data.  Today the only bundled option is to parse the sar
> data.  That is painful for a number of reasons (group that admins box
> B has sar collecting over different intervals, just manipulating the
> data in general is rather annoying and time consuming, etc.)  Going
> forward, one could write a bunch of custom scripts to run vmstat,
> mpstat, etc. and write them to a log or a database or a central
> server, or one could just avoid reinventing the wheel and just make
> them available via snmp.  One round wheel is as good as another, so I
> don't feel the need to make another one :)

This is touching on a different issue - yes, we need a better performance
archive solution than sar.  Fishworks has bundled a kstat/DTrace based
one called Analytics in the new storage products (which outright kills
any need for sar), although that doesn't help us on [Open]Solaris right now.

This may be an opportunity to create new and more useful perf statistics
that we export via kstat and SNMP - which I think has more value than
reheating what's already there.  Consider the following:

 $ sysperfstat 1
             ------ Utilisation ------     ------ Saturation ------
     Time    %CPU   %Mem  %Disk   %Net     CPU    Mem   Disk    Net
 23:07:10    0.85  44.11   2.40   0.19    0.01   0.00   0.00   0.00
 23:07:11    7.00  95.17   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:12    4.00  95.63   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:13    5.00  96.09   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:14    5.00  96.55   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:15    5.00  97.01   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:16    6.00  97.47   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:17    5.00  97.92   0.00   0.00    0.00   0.00   0.00   0.00
 23:07:18    9.00  97.84   2.00   0.00    0.00  20.51   0.04   0.00
 23:07:19    6.00  97.92   2.75   0.00    0.00  13.04   0.04   0.00
 23:07:20    6.00  97.91   2.85   0.00    0.00  18.22   0.04   0.00
 [...]

I wrote this as a solution to the problem of system wide observability (and
in this case, one that fits in an 80-char wide format - SNMP doesn't have that
restriction, so should serve a better and more detailed selection of
statistics.)  Serving out vmstat style metrics without addressing why is a
solution in search of a problem - and a solution that's about 25 years old.
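
For illustration only, here is a minimal sketch of the interval-delta idea
mentioned earlier in the thread: take two snapshots of cumulative counters and
turn the difference into a utilisation percentage like the %CPU column above.
The CpuSnapshot fields and the sample values are hypothetical; they are not
real kstat names, and this is not how sysperfstat itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class CpuSnapshot:
    """Cumulative CPU tick counters at one instant (hypothetical field names)."""
    idle: int
    user: int
    kernel: int


def cpu_utilisation(prev: CpuSnapshot, curr: CpuSnapshot) -> float:
    """Return the percentage of non-idle ticks between two snapshots."""
    d_idle = curr.idle - prev.idle
    d_busy = (curr.user - prev.user) + (curr.kernel - prev.kernel)
    total = d_idle + d_busy
    if total <= 0:
        # No ticks elapsed (or counters went backwards): report 0 rather than divide.
        return 0.0
    return 100.0 * d_busy / total


# Made-up sample: 100 ticks elapsed between snapshots, 5 of them non-idle.
prev = CpuSnapshot(idle=1000, user=200, kernel=100)
curr = CpuSnapshot(idle=1095, user=203, kernel=102)
print(f"%CPU {cpu_utilisation(prev, curr):.2f}")
```

The same subtract-and-divide pattern would apply to any cumulative counter;
only the choice of what counts as "busy" and what counts as "total" changes.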

...

But, if you are wedded to the idea of re-serving vmstat and what not, I'd make
that clear in the MIB - that this is the SNMP view of vmstat etc - which
hopefully doesn't confuse anyone more than the existing tools.  Customers can
also use existing documentation to understand vmstat's ancient metrics.  And
so, 'perfmib' may be a bad name - it's not the best perf MIB we could possibly
do; perhaps 'perftoolsmib' - as the SNMP view of common perf tools...  Which
I'd agree does have real value. :)

Brendan

-- 
Brendan Gregg, Sun Microsystems Fishworks.    http://blogs.sun.com/brendan
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
