On Fri, Feb 13, 2009 at 12:09 PM, David Collier-Brown <dav...@sun.com> wrote:
> Brendan Gregg wrote:
>> No. Stop. Do not assume any data is better than no data. Wrong
>> or misleading data is *worse* than no data.
>
> I agree most emphatically. Most metrics recorded by programs
> are those needed by the authors of the programs. All too
> few are the ones we need to *use* the program.
>
> Solaris is good at collecting, for example, queue occupancy
> and depth instead of just depth, but we can do more. And
> the Fishworks guys are the ones doing it, for storage.
>
> I therefore suggest we *do* do more, Brendan's alternative B:
>> make an effort to export useful performance metrics, to meet
>> stated needs. Examine what's there and keep what is good (I think
>> "iostat -xne" output is great), drop what's bad (some of vmstat),
>> and add what is missing - which means adding kstats to the kernel.
>
> --dave (an evil contractor and capacity planner in Toronto) c-b

I'm all for defining better metrics. Please join the discussions on perfmib-dev at opensolaris.org and add the things you think are useful.

The *stat commands were chosen (really, all that has been discussed so far is certain bits of data corresponding to what the mpstat and fsstat commands show) because at least some of the data they present has proven useful, and the metrics they collect are unlikely to go away anytime soon. That makes it a bit harder to argue that, say, the per-CPU cross-call rate is unstable, might go away tomorrow, and thus shouldn't ever be used by anything.

The problem is that there seems to be this mortal fear that, after going through all the effort to create these nice means of obtaining data from the OS (kstat, dtrace, etc.), someone (who isn't a Sun employee) will somewhere, somehow actually use them. Anytime anyone suggests using the data in some way that might prove useful, they are told 'that data is not stable, you cannot use it'. The only 'blessed (by Sun)' means of obtaining data from the system are the *stat commands, so those are in effect the only things we can arguably work with without getting shot down by stability arguments. But then there's another group that says 'but those metrics are bad! don't use those!'

If one listens to both groups, we are left with doing nothing and hoping that at some point in the future Sun decides to do something. The existing commands are apparently inadequate, and Sun will shoot down any attempt to make anything better (if they're not the ones doing it) because the sources of the data are not deemed stable (in an interface sense).

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org