There has been an ongoing thread about uptime graphing vs. alerting, and I wanted to throw in my two cents. (Disclaimer: these are my opinions, not necessarily those of my employer.)

SA is a very, very good tool for alerting on specific failures. For performance monitoring, we use MRTG. We plot CPU loads, disk queue lengths, bandwidth, etc. for all our key servers and network equipment. This is very good "capacity" data for planning and troubleshooting.

We don't have anything that easily provides "SLA-type" data like "% uptime". I think the problem with using a tool like SA to get that data would be the granularity. Our cycles run every 5 minutes, and we have some checks (like free disk space) that we only run hourly. This would be poor resolution for "uptime" reporting.

Yes, I wish SA could produce it, since it's a tool I like to work with. But if push comes to shove, I'd rather have Woodstone work on opening up the configuration file (XML ???!?!?) so alerts could be programmatically created and updated than develop a tool to do uptime reporting. That would keep the focus on what SA does well: alerting.

=tv=
-Tom
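
To put rough numbers on the granularity point, here is a minimal sketch. It assumes a 30-day month and that every failed check gets charged as one full polling interval of downtime. Both are illustrative assumptions, not anything SA is known to do, and the function names are made up for this example.

```python
# Rough arithmetic on how polling interval limits "% uptime" resolution.
# Assumption: one failed check counts as a whole polling interval of downtime.

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in an assumed 30-day month


def uptime_step_percent(interval_minutes: float) -> float:
    """Smallest change in monthly uptime % that a single check can register."""
    return 100.0 * interval_minutes / MINUTES_PER_MONTH


def downtime_budget_minutes(sla_percent: float) -> float:
    """Minutes of downtime per month that a given uptime target allows."""
    return MINUTES_PER_MONTH * (100.0 - sla_percent) / 100.0


def failed_checks_allowed(sla_percent: float, interval_minutes: float) -> int:
    """Whole failed checks that fit inside the monthly downtime budget."""
    return int(downtime_budget_minutes(sla_percent) // interval_minutes)


if __name__ == "__main__":
    for interval in (5, 60):  # the 5-minute cycle and the hourly checks
        print(
            f"{interval:>2}-minute checks: "
            f"step = {uptime_step_percent(interval):.4f}% per failed check, "
            f"failed checks allowed at 99.9% = {failed_checks_allowed(99.9, interval)}"
        )
```

Under these assumptions, a 99.9% monthly target leaves about 43 minutes of downtime. That is only a handful of failed 5-minute checks, and a single failed hourly check already exceeds the budget, so the reported percentage would say more about the polling schedule than about the actual outage.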
