On Fri, Feb 13, 2009 at 12:09 PM, David Collier-Brown <dav...@sun.com> wrote:
> Brendan Gregg wrote:
>> No.  Stop.  Do not assume any data is better than no data.  Wrong
>> or misleading data is *worse* than no data.
>
>  I agree most emphatically.  Most metrics recorded by programs
> are those needed by the authors of the programs. All too
> few are the ones we need to *use* the program.
>
>  Solaris is good at collecting, for example, queue occupancy
> and depth instead of just depth, but we can do more. And
> the Fishworks guys are the ones doing it, for storage.
>
>  I therefore suggest we *do* do more, Brendan's alternative B:
>> make an effort to export useful performance metrics, to meet
>> stated needs.  Examine what's there and keep what is good (I think
>> "iostat -xne" output is great), drop what's bad (some of vmstat),
>> and add what is missing - which means adding kstats to the kernel.
>
> --dave (an evil contractor and capacity planner in Toronto) c-b

I'm all for defining better metrics.  Please join in the discussions
on perfmib-dev at opensolaris.org and add what things you think are
useful.  The reason for choosing the *stat commands (really, only certain
bits of data corresponding to what the mpstat & fsstat commands show
have been discussed) is that at least some of the data they present has
proven useful, and the metrics they collect are unlikely to go away
anytime soon.  That makes it a bit harder to argue that, say, the
per-CPU cross-call rate is unstable, might go away tomorrow, and thus
shouldn't ever be used by anything.

The problem is that there seems to be this mortal fear that after
going through all the effort to create these nice means of obtaining
the data from the OS (kstat, dtrace, etc.), someone (who
isn't a Sun employee) will somewhere, somehow actually use them.
Anytime anyone suggests trying to use the data in some way that might
prove useful, they are told 'that data is not stable, you cannot use
it'.  The only 'blessed (by Sun)' means of obtaining data from the
system are the *stat commands, so those are in effect the only things
we can arguably work with without getting shot down by stability
arguments, but then there's another group that says 'but those metrics
are bad! don't use those!'

If one listens to both groups, we are left with doing nothing and
hoping that at some point in the future Sun decides to do something.
The existing commands are apparently inadequate, and Sun will shoot
down any attempt to make anything better (if they're not the ones
doing it) because the sources of the data are not deemed stable (in an
interface sense).
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
