An infrastructure (servers, storage, desktops/workstations/laptops, network,
etc.) is worth nothing in itself if it does not deliver proper application
response time or QoE. I have since decided to take apart all critical
business apps (that is, identify every component they are made of) and then
decide what to monitor, and how, in the underlying
systems/processes/components that make up each transaction "path"
end-to-end, to the best of my ability and the tools available. (I use the
term "transaction" loosely - VoIP is among the apps that require
infrastructure monitoring, of course.) I would thus answer your question by
saying that "it depends" completely on what you are supporting ... and here
is where I would start:

http://www.netqos.com/resourceroom/whitepapers/forms/handbook.html

***Stefan Mititelu
http://twitter.com/netfortius
http://www.linkedin.com/in/netfortius


On Tue, Oct 13, 2009 at 3:28 PM, Atom Powers <atom.pow...@gmail.com> wrote:

> On Tue, Oct 13, 2009 at 1:19 PM, Rob Cherry <lo...@lxrb.com> wrote:
> > I am in a situation of starting a site from scratch and thus have no
> > historical helpful configurations to build from.  I have just finished
> > implementing a monitoring solution and now I need to tell it what I care
> > about.  There is obvious stuff like availability and response times, but
> > other metrics have become more tricky.
>
> I just put together a simple list for a friend. Of course, what you
> monitor depends on what you want to know. In this case, I want to know
> if there is a problem I need to take care of so that I don't need to
> drive in to work an hour later.
> --
> I have centralized remote monitoring for most of my systems. I trigger
> an alert if three consecutive values are out of range (checked every
> 60s); these are triggers that indicate an immediate problem.
> ICMP host unreachable
> CPU process queue, >3
> CPU idle, <10%
> Free Swap space, <10MB
> Free Disk Space, <20%
> Available memory, <10MB
> Network queue, >5
> Network errors, >3
> *process not running
> *process TCP port unreachable
>
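The "three consecutive values out of range" rule in the list above could be
sketched roughly as below. This is only an illustration, not the poster's
actual setup: the class, the function names and the metric names are made up
for the example, only the numeric thresholds come from the list, and how the
samples get collected every 60s is left out entirely.

```python
from collections import deque


class ConsecutiveTrigger:
    """Alert only when the last `needed` samples were all out of range."""

    def __init__(self, out_of_range, needed=3):
        self.out_of_range = out_of_range    # predicate: sample -> bool
        self.recent = deque(maxlen=needed)  # rolling window of verdicts

    def observe(self, sample):
        # Record the newest verdict; the deque silently drops the oldest one.
        self.recent.append(bool(self.out_of_range(sample)))
        return len(self.recent) == self.recent.maxlen and all(self.recent)


# Thresholds copied from the quoted list; the metric names are hypothetical.
THRESHOLDS = {
    "cpu_run_queue": lambda v: v > 3,
    "cpu_idle_pct": lambda v: v < 10,
    "free_swap_mb": lambda v: v < 10,
    "free_disk_pct": lambda v: v < 20,
    "available_mem_mb": lambda v: v < 10,
    "net_queue": lambda v: v > 5,
    "net_errors": lambda v: v > 3,
}


def make_triggers(needed=3):
    """Build one independent trigger per metric."""
    return {name: ConsecutiveTrigger(pred, needed)
            for name, pred in THRESHOLDS.items()}


def check(triggers, samples):
    """Feed one polling round of samples; return the metrics that alert."""
    return sorted(name for name, value in samples.items()
                  if triggers[name].observe(value))
```

Feeding cpu_idle_pct readings of 8, 50, 7, 6, 5 through check(), one per
polling round, raises the alert only on the last reading: the single good
sample (50) breaks the first run, so a brief spike never pages anyone.
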
> I keep a history of those, plus several more metrics for capacity planning.
> CPU system/user/wait time
> Network traffic in/out/total
> Disk read/write/queue
> number of processes
> CPU temperature
> UPS load
>
>
> --
> Perfection is just a word I use occasionally with mustard.
> --Atom Powers--
>
> _______________________________________________
> Discuss mailing list
> Discuss@lopsa.org
> http://lopsa.org/cgi-bin/mailman/listinfo/discuss
> This list provided by the League of Professional System Administrators
>  http://lopsa.org/
>