Re: [perf-discuss] capacity planning tools - status

David Collier-Brown Mon, 19 Jan 2009 13:02:11 -0800

  The classic tools for doing capacity planning measure
operation/transaction rate and time taken, rather
than resource usage.


  Once you've *done* the planning, you correlate those
rates and times with the resources you actually used, and
use that to predict when you need to add more resources.

  The three programs which do the CP part are Teamquest
Model, BMC Perform/Predict and Neil Gunther's PDQ. The
first two are for engineers, the third for mathies.

  It all comes down to two diagrams, sketched here in
ASCII (I hope they don't get munged by the listserv):

Throughput
^            -----------------
|           /
|          /
|         /
|        /
|       /
|      /
|     /
|    /
|   /
|  /
+----------------------> Load, TPS

Response Time (S)
^                   /
|                  /
|                 /
|                /
|               /
|              /
|             /
|            /
|___________/
|
+----------------------> Load, TPS
 0         N*


At some load, called "N*", usage of some resource
hits 100%, throughput stops rising and processes
have to wait in a queue for service.  After
that load, the response time goes up like a rocket,
and most people say "the system has hit the wall".
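
As a rough sketch of those two curves: the usual asymptotic
bounds from operational analysis give the same shapes. This
assumes a closed system where load is the number of concurrent
users N, each transaction has a fixed service demand at each
device, and users think for Z seconds between requests. The
function names and the demand numbers are made up for
illustration; they don't come from any of the tools above.

```python
# Sketch only: asymptotic bounds for a closed system (operational analysis).
# demands    = seconds of service each transaction needs at each device (assumed).
# think_time = seconds a user waits between transactions (assumed).

def n_star(demands, think_time=0.0):
    """Load at which the busiest device reaches 100% utilization."""
    return (sum(demands) + think_time) / max(demands)


def throughput_bound(n, demands, think_time=0.0):
    """Upper bound on throughput (TPS) with n concurrent users."""
    return min(n / (sum(demands) + think_time), 1.0 / max(demands))


def response_bound(n, demands, think_time=0.0):
    """Lower bound on response time (seconds) with n concurrent users."""
    return max(sum(demands), n * max(demands) - think_time)


# Invented example: CPU 10 ms, disk 40 ms, network 5 ms per transaction.
example_demands = [0.010, 0.040, 0.005]
example_think = 1.0

for users in (5, 10, 20, 40, 80):
    x = throughput_bound(users, example_demands, example_think)
    r = response_bound(users, example_demands, example_think)
    print(f"N={users:3d}  X<={x:6.2f} TPS  R>={r:6.3f} s")
print("N* =", n_star(example_demands, example_think))
```

Below N* the throughput bound rises in a straight line and the
response-time bound stays flat; above it the throughput bound
is pinned at 1/max(demand) and the response-time bound climbs
linearly. Real systems round off the corner, but the knee sits
in the same place.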

If you collect demand (load) and plot it against
response time on a device-per-device basis, you can
find N*, and identify the device that bottlenecked,
so you can figure out how much more you need to
reach a particular load without hitting the wall.
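
One crude way to pick N* out of measured data, as a sketch:
take the average response time at the lightest loads as the
flat baseline, then walk up the load axis until the response
time exceeds some multiple of that baseline. The function
names, the 1.5x tolerance and the sample numbers are all my
assumptions, not anything the commercial tools do.

```python
# Sketch only: locate the knee (N*) per device from (load, response_time) samples.

def find_knee(samples, baseline_points=3, tolerance=1.5):
    """Return the highest load seen before response time first exceeds
    tolerance * baseline, where baseline is the mean response time of the
    lowest-load samples. Returns None if even the first sample exceeds it."""
    ordered = sorted(samples)
    count = min(baseline_points, len(ordered))
    baseline = sum(rt for _, rt in ordered[:count]) / count
    knee = None
    for load, rt in ordered:
        if rt > tolerance * baseline:
            break
        knee = load
    return knee


def bottleneck(per_device):
    """Return (device, knee) for the device whose knee comes at the lowest load.
    Devices with no usable knee (None) are skipped."""
    knees = {name: find_knee(s) for name, s in per_device.items()}
    usable = {name: k for name, k in knees.items() if k is not None}
    return min(usable.items(), key=lambda item: item[1])


# Invented measurements: load in TPS, response time in seconds.
measurements = {
    "cpu":  [(10, 0.005), (20, 0.005), (30, 0.005),
             (40, 0.005), (50, 0.006), (60, 0.006)],
    "disk": [(10, 0.020), (20, 0.020), (30, 0.021),
             (40, 0.022), (50, 0.060), (60, 0.150)],
}
print(bottleneck(measurements))
```

A device that never leaves its baseline reports the highest
load measured as its knee, which only means it didn't
saturate in the range you tested.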

CPU is easy: it's already reported as a percentage of
the maximum, so you can see if it's the problem.
Disk and network I/O aren't as easy, and, as
Murphy would have it, disk I/O is what most
programs bottleneck on. (Darn that Murphy!)
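
For a device that doesn't hand you a percentage, the
utilization law (utilization = throughput x service time)
gets you one. A minimal sketch, assuming a single-server
device and a service time that excludes time spent queued;
the function names and numbers are illustrative only.

```python
# Sketch only: utilization law for a single-server device.

def utilization(throughput, service_time):
    """Fraction of time the device is busy (1.0 means 100%).
    throughput is in operations/second, service_time in seconds/operation."""
    return throughput * service_time


def max_throughput(service_time):
    """Operations/second at which the device reaches 100% utilization."""
    return 1.0 / service_time


# Invented example: a disk doing 150 I/Os per second at 5 ms each.
iops = 150
svc = 0.005
print("utilization:", utilization(iops, svc))
print("saturates at:", max_throughput(svc), "I/Os per second")
```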

For papers and books on this, start with the
Teamquest site and view their webinar
http://teamquest.com/resources/webinars/display/30/index.htm
and if you don't mind math, follow up with Neil
Gunther's site, http://www.perfdynamics.com/

Finally, if you'll only ever have resource information,
see John Allspaw's O'Reilly book, "The Art of Capacity
Planning", which only cheats the tiniest bit by using
Linux I/O wait (service) time as the metric.

--dave

Stefan Parvu <stefanparv...@yahoo.com> wrote:
> Solaris has a rich number of tools/APIs in order to help to put
> together a Capacity Planning model of your servers/site:
> 
> - /proc, process statistics 
> - Kernel Statistics: used by a majority of userland tools to obtain:
> 1. CPU/Cores utilization (Adrian Cockcroft might disagree with this [1], 
> however still remains important to collect and follow along with your 
> application's throughput and response time)
> 2. Memory utilization
> 3. Disk utilisation
> 4. Network utilization
> 
> All 4 of these metrics give you an idea, more or less, of what's going on with
> your system(s), provided you have faith that the kernel developers won't play
> with the kstat interface and will keep it consistent over time (e.g.: if
> per-CPU data changes you can say goodbye to your capacity plan).
> All of these, hand in hand with the throughput and response times of your
> applications, are an important part of any capacity planning model.
> 
> We have lots of tools in (Open)Solaris but we need to better integrate them to
> help anyone interested in building a Capacity Planning model for their Solaris
> systems. Couple of things missing:
> 
> - corestat (should be part of the ON).
> 
> - sysperfstat (combined CPU, Mem, Disk, Net utilization/saturation). If this
> is not integrated, Sun should have a similar tool.
> 
> - nicrec (important to measure the bandwidth capacity of your server)
> 
> - say you run IP exclusive, each local zone having its own TCP/IP stack. How 
> can you obtain from global zone the kstat numbers (currEstab,...) 
> for each TCP/UDP module of all local zones ?
> 
> - zone utilization: ready tools to measure the utilisation of each container
> deployed in the global zone.
> 
> - are there any papers covering all the current tools and how they can be
> used to develop a simple Capacity Planning model?
> 
> For fun I combined some tools together and modified some others to make my
> life as a sysadmin easier [2].
> http://www.nbl.fi/~nbl97/solaris/perf/index.html
> 
> Comments ?
> 
> thanks,
> Stefan
> 
> [1] - Utilization is Virtually Useless as a Metric! Adrian Cockcroft, eBay
> Research Labs
> [2] - (SE Toolkit could have been easily used here, however not all sites 
> allow installing extra software adding new kernel drivers)
> -- This message posted from opensolaris.org 


-- 
David Collier-Brown            | Always do right. This will gratify
Sun Microsystems, Toronto      | some people and astonish the rest
dav...@sun.com                 |                      -- Mark Twain
cell: (647) 833-9377, bridge: (877) 385-4099 code: 506 9191#
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
