I've worked with tons of programs that do this, and none of them has met all my requirements. One thing, though: I didn't want to rely on SNMP. My ideal solution would be a system that sshes into a machine, runs commands, and processes the output.
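A minimal sketch of that ssh-based approach, assuming key-based login is already set up and the remote machine has a POSIX `df`. The function names, the 80% limit, and the choice of `df -P` as the command are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import subprocess


def parse_df(output):
    """Parse `df -P` output into {mount_point: percent_used}."""
    usage = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        # POSIX columns: filesystem, blocks, used, available, capacity, mount point.
        # Assumes the filesystem name itself contains no spaces.
        percent = fields[4].rstrip("%")
        if percent.isdigit():
            usage[" ".join(fields[5:])] = int(percent)
    return usage


def over_threshold(usage, limit=80):
    """Return only the mount points whose usage is above `limit` percent."""
    return {mount: pct for mount, pct in usage.items() if pct > limit}


def check_host(host, limit=80, timeout=30):
    """Run `df -P` on `host` over ssh and return the over-limit mount points.

    BatchMode=yes makes ssh fail instead of prompting for a password,
    which is what you want when this runs unattended (e.g. from cron).
    """
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, "df", "-P"],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return over_threshold(parse_df(result.stdout), limit)
```

Calling `check_host("somehost")` (a hypothetical host name) returns a dict of mount points above the limit. Mailing an alert or storing history would be separate steps on top of this.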
In no particular order:

1. Monit is good for status, not history. If you want to know what's running NOW, get an email when disk usage goes above 80%, or watch anything that can be checked remotely (ping, HTTP/S, MySQL), then monit is great. No pretty graphs, though.

2. Nagios is the best for all-around monitoring, but configuration is a pain. Installing the Groundwork framework makes this a cinch, though installing Groundwork itself is slightly painful and pretty invasive on the host machine. If you have a machine to dedicate to this, it is probably the best solution for most cases.

3. MRTG is good, especially if all your devices speak SNMP. It's cute and simple: you cron a Perl script to run every, say, 5 minutes, and it outputs a bunch of graphs. I don't think it does warnings, emails, etc.

4. Orcallator. Some large companies I know use this. I tried it once. I seem to remember that I liked it, though I couldn't find online any of the features that I thought I liked :-)

5. Zabbix - very robust, but requires their agents to be installed on all non-SNMP machines. If you are willing to do this, then it is a great option. Really, Zabbix is a mature system that works great.

6. Zenoss - compatible with Nagios plugins. This one reminds me of a mash-up of Zabbix and Nagios.

7. Cacti - not recommended. I don't remember why.

On 8/9/07, Shachar Shemesh <[EMAIL PROTECTED]> wrote:
> Oren Held wrote:
> > A friend of mine (Amnon) found Munin (http://munin.projects.linpro.no/),
> > which is a great system resource grapher tool which has plugins for almost
> > everything from swap, ntp time drifts, disk temperature - to mysql queries
> > per second.
> >
> > However, it draws graphs, I'm not sure it can send alerts. You can set
> > limits (like highest cpu temperature or free disk space) which tag the
> > whole node as "red", maybe it's even capable of notifying.. worth a check
> > I guess.
> >
> I don't know about alerts, but I find that it is the best tool I know
> for getting the "general health" of a system. That is something no graph
> specific test can tell you, because it often involves measurements you
> did not think of before they happened.
>
> For anyone who is interested in seeing it in action, Hamakor's new
> server has it running, open for all to see: http://hamakor.org.il/munin/
>
> Shachar
>
> =================================================================
> To unsubscribe, send mail to [EMAIL PROTECTED] with
> the word "unsubscribe" in the message body, e.g., run the command
> echo unsubscribe | mail [EMAIL PROTECTED]