On 4/27/10 2:15 PM, "Justin Lloyd" <jll...@digitalglobe.com> wrote:

> This is a follow-on to my original thread about creating a Solaris SMF
> service. Since I'm no longer doing that and have decided to let Zenoss
> do that, I was curious about what others are doing along these lines. A
> couple of things came to mind as I was mulling over how to do this.
>
> Using SNMP to monitor cf-execd processes is probably the best way,
> except for the caveat about which I just learned that the SNMP MIB ends
> with the processes' PIDs and Cfengine by default restarts itself at 5
> AM, which would lead to unnecessary alerts. Is that restart necessary
> and, if so, what's a good way to handle monitoring cf-execd?

I'm sure it is common knowledge... But if you do want to use SNMP, I've
had better luck with this sort of thing when I bundle the heavy "is it
OK" logic in a local script which runs and dumps state to some
/tmp/file, then use an extend entry in snmpd.conf to cat the file
(which always returns quickly).

> Zenoss could also restart cf-execd if it's been down for some specified
> amount of time.

Event handlers all the way.

> Also, I figured I should probably have Zenoss also monitor cf-serverd
> and cf-monitord, even though Cfengine already monitors and will restart
> them. My thought here was in case something really gets broken and
> either or both of those two do not start up correctly. So I figured
> maybe only alert if they're down for more than 20 or 30 seconds. Anyone
> dealing with this?

That's exactly what we do. We actually have daemontools restart the cf
daemons on our servers if they die, and monitoring also watches for
missing processes (something went really wrong) but has an extra-long
threshold so people aren't paged during "normal" or at least
self-healing restarts.

_______________________________________________
Help-cfengine mailing list
Help-cfengine@cfengine.org
https://cfengine.org/mailman/listinfo/help-cfengine
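
For what it's worth, here is a rough sketch of the "local script dumps state, snmpd just cats the file" idea combined with the long-threshold idea. It is only an illustration, not what we actually run: the state file path, the 30-second grace period, the status words, and the extend name in the comment are all made up for the example, and it assumes pgrep is available on the host.

```python
#!/usr/bin/env python3
"""Write cf-execd liveness to a state file that snmpd can serve via `extend`.

Run this from cron (or similar). A matching snmpd.conf entry could look like
this (name and path are assumptions for the example):

    extend cfexecd-state /bin/cat /tmp/cfexecd.state
"""
import os
import subprocess
import tempfile
import time

STATE_FILE = "/tmp/cfexecd.state"  # hypothetical path
GRACE_SECONDS = 30                 # hypothetical "don't page yet" threshold


def process_running(name):
    """Return True if pgrep finds a process with exactly this name."""
    try:
        result = subprocess.run(
            ["pgrep", "-x", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # pgrep itself is missing or not executable; treat as "not running".
        return False
    return result.returncode == 0


def next_state(running, now, down_since, grace=GRACE_SECONDS):
    """Return (status, down_since) given the current check result.

    A missing process is only CRITICAL once it has been gone for at least
    `grace` seconds, so a routine self-healing restart reports RESTARTING.
    """
    if running:
        return "OK", None
    if down_since is None:
        down_since = now
    status = "CRITICAL" if now - down_since >= grace else "RESTARTING"
    return status, down_since


def read_down_since(path):
    """Return the 'down since' timestamp stored in the state file, or None."""
    try:
        with open(path) as fh:
            fields = fh.read().split()
        if len(fields) > 1 and fields[1] != "-":
            return float(fields[1])
    except (OSError, ValueError):
        pass
    return None


def write_state(path, status, down_since):
    """Replace the state file atomically so `cat` never sees a partial write."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        fh.write("%s %s\n" % (status, "-" if down_since is None else down_since))
    os.chmod(tmp, 0o644)  # mkstemp creates 0600; the snmpd user must read it
    os.replace(tmp, path)


if __name__ == "__main__":
    since = read_down_since(STATE_FILE)
    status, since = next_state(process_running("cf-execd"), time.time(), since)
    write_state(STATE_FILE, status, since)
```

The point of the split is that the SNMP side only ever reads a small file, so a slow or hung check never stalls snmpd, and the 5 AM restart shows up as RESTARTING rather than as an alert unless the process stays gone past the grace period.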