Rob Cherry wrote:
> I am in a situation of starting a site from scratch and thus have no 
> historical helpful configurations to build from.  I have just finished 
> implementing a monitoring solution and now I need to tell it what I 
> care about.  There is obvious stuff like availability and response 
> times, but other metrics have become more tricky.
>
> My historical knowledge all seems increasingly irrelevant too.  For 
> example, checking free memory on a modern Solaris 10 box makes no 
> sense - all the memory is "stolen" by the kernel for aggressive ZFS 
> caching and given up when applications need it. 
>
> Given this and other considerations, I will kick off a list of what I 
> intend to monitor, but I would be very curious to know what everyone 
> else is doing and whether they agree/disagree with the list -
>
>
> Solaris/Linux
>  - / % usage
>  - /tmp % usage
>  - swap % usage
>  - CPU load
>  - Overall response times for various services such as http/ssh etc.
>
> Windows
>  - %SYSTEMROOT% % free
>  - memory % free
>  - CPU % free
>  - Response times on well known ports
>
> Cisco/Networking equipment
>  - Internal temperatures
>  - CPU/Memory utilization
>
> UPS
>  - Output load average
>  - Battery % Charge
>  - Minutes remaining / am I on battery
>
> Any glaring omissions?
>
> Rob
Maybe not glaring, but if you're collecting host stats, I'd put in a plug 
for memory paging stats (page-in, page-out). If you're worried about 
swap usage on Solaris, you might as well also watch desperation free (de) 
in the memory stats; anything above 0 is considered bad. The same goes 
for scan rate (sr).
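A rough sketch of how a check on those two columns might look, assuming the monitoring system hands you the column-name line and one sample line from Solaris-style vmstat output. The function names and the sample numbers are made up for illustration, and treating any nonzero sr as an alert is an assumption that simply reuses the "above 0" rule given for de:

```python
def parse_vmstat(header: str, row: str) -> dict[str, int]:
    """Map vmstat column names to integer values for one sample line.

    vmstat repeats the name "sy" (faults and cpu), so the first
    occurrence of a name wins and later duplicates are ignored.
    """
    stats: dict[str, int] = {}
    for name, value in zip(header.split(), row.split()):
        stats.setdefault(name, int(value))
    return stats


def paging_alerts(stats: dict[str, int], watched=("de", "sr")) -> list[str]:
    """Return one message per watched column whose value is above 0."""
    return [
        f"{name}={stats[name]} (expected 0)"
        for name in watched
        if stats.get(name, 0) > 0
    ]


# Hypothetical sample; real output has more disk columns and other values.
header = "r b w swap free re mf pi po fr de sr s0 s1 in sy cs us sy id"
row = "0 0 0 5123456 204800 3 12 0 45 60 128 310 0 0 402 1290 350 4 2 94"

stats = parse_vmstat(header, row)
alerts = paging_alerts(stats)
for message in alerts:
    print(message)
```

The first vmstat line reports averages since boot, so a real check would probably skip it and look at later interval samples.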

Cisco networking gear: why not collect interface stats like in/out 
octets and/or line usage (Cisco has a private MIB for that), as well as 
drops and errors per interface?
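If the private MIB isn't available, line usage can be approximated from two polls of the standard octet counters (ifInOctets or ifOutOctets in IF-MIB). This is only a sketch: the function names, poll interval, and sample values are illustrative assumptions, and it assumes 32-bit counters that wrap at most once between polls.

```python
COUNTER32_MODULUS = 2**32  # Counter32 values wrap back to 0 after 2^32 - 1


def counter_delta(old: int, new: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Octets transferred between two polls, allowing for a single wrap."""
    return (new - old) % modulus


def utilization_percent(old_octets: int, new_octets: int,
                        interval_seconds: float, link_bps: float) -> float:
    """Average utilization of one direction of a link over the interval."""
    if interval_seconds <= 0 or link_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits = counter_delta(old_octets, new_octets) * 8
    return 100.0 * bits / (interval_seconds * link_bps)


# Hypothetical polls taken 300 seconds apart on a 100 Mbit/s interface.
usage = utilization_percent(1_000_000, 38_500_000, 300, 100_000_000)
print(f"{usage:.2f}% utilized")

# A counter that wrapped between polls still yields a small positive delta.
wrapped = counter_delta(2**32 - 100, 400)
```

On fast links a 32-bit counter can wrap more than once per poll interval, in which case the 64-bit ifHCInOctets/ifHCOutOctets counters are the safer source.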


_______________________________________________
Discuss mailing list
Discuss@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
