2010/2/9 Deb Heller-Evans <d...@es.net>

> Justin,
>
> I see where you're going here - that you want to alert on unkept promises.
> But I am sure that, like many here on this list, I receive hundreds if not
> thousands of emails per day that are already filtered and sorted, often
> with more information than I can or want to process.  Alternatively, you
> could log the condition to a file rather than send an alert email, and some
> parsing function could periodically alert you to the negative status.
>
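The "log to a file, then parse periodically" idea could be sketched roughly as below. This is only an illustration: the log line format ("<timestamp> <status> <handle>") and the function name summarize_not_kept are assumptions made up for the example, not anything Cfengine writes today.

```python
from collections import Counter


def summarize_not_kept(lines):
    """Count unkept promises per handle in a hypothetical status log.

    Each line is assumed to look like "<timestamp> <status> <handle>";
    this format is an assumption for the sketch, not a Cfengine log format.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Skip malformed lines and anything that is not a failure.
        if len(parts) >= 3 and parts[1] == "not_kept":
            counts[parts[2]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2010-02-09T10:00 not_kept nodename_file",
        "2010-02-09T10:05 repaired motd",
        "2010-02-09T10:10 not_kept nodename_file",
    ]
    for handle, count in summarize_not_kept(sample).items():
        print(f"{handle}: not kept {count} time(s)")
```

Run from cron, something like this would produce one digest per period instead of one email per event, which is the point of the suggestion.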
> What you're describing might be a good tickler for a Nagios alert
> condition. We've found the alert mechanisms in Nagios to scale well over
> hundreds of systems, without the necessity of email floods. Haven't yet
> coupled Nagios with Cfengine, but it's on my horizon.
>
>
> Kind Regards,
> deb
>
> Deb Heller-Evans               1 Cyclotron Road
> Computer Systems Engineer      Berkeley, CA 94720
> ESnet  http://www.es.net/      Desk: 510/495-2243
>
>
> On Fri, 5 Feb 2010 10:55:18 -0700, Justin Lloyd wrote:
> > Hi all,
> >
> > I’ve opened a ticket on this but I wanted to share my thoughts with
> > the community to see if anyone has had the same thought and perhaps
> > has already implemented something to this effect.
> >
> > I’d like for Cfengine on each host to be able to send an email every
> > time it tries to repair a promise, whether or not it is successful.
>

All we need is for cfengine to *log* the fact that a promise repair failed.
That would be sufficient, because cf-execd would then notice that the output of
cf-agent differs from the previous run and would email the output as per
its normal behavior. This of course assumes you are running cf-execd and
are NOT running cf-agent with --inform, in which case the output will
always be different.
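
To make the "differs from the previous run" idea concrete, here is a minimal sketch of that kind of comparison. It is not cf-execd's actual code; the function name output_changed and the state file holding a hash of the last output are assumptions for illustration only.

```python
import hashlib
import tempfile
from pathlib import Path


def output_changed(output: str, state_file: Path) -> bool:
    """Return True when `output` differs from what the previous run produced.

    A hash of the last output is kept in `state_file` (a hypothetical
    location chosen for this sketch) and overwritten on every call.
    """
    digest = hashlib.sha256(output.encode("utf-8")).hexdigest()
    previous = state_file.read_text() if state_file.exists() else None
    state_file.write_text(digest)  # remember this run for the next comparison
    return digest != previous


if __name__ == "__main__":
    state = Path(tempfile.mkdtemp()) / "previous_output.sha256"
    for run_output in ["repair failed: nodename", "repair failed: nodename", ""]:
        if output_changed(run_output, state):
            print("output changed, would send mail")
        else:
            print("same as last run, stay quiet")
```

With a scheme like this, a promise that keeps failing in the same way produces one mail rather than one per run, and output that changes on every run defeats the comparison entirely.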


> > Maybe something as simple as this:
> >
> > body agent control {
> >     repair_email_address => "cfengine-repa...@mycompany.com";
> >     # perhaps some additional tunable parameters, e.g.
> >     #  report_on => { "repaired" | "not_kept" | "any" };
> >     #  include_error => { "true" | "false" };
> >     #  success_subject_prefix => "[nova promise repaired] ";
> >     #  failure_subject_prefix => "[nova promise not kept] ";
> >     #  etc.
> > }
> >
> > This would allow for a more real-time view of the Cfengine
> > environment, by enabling each host to send an email with repair
> > success or failure, promise handle, any relevant error message, etc.
> > For example, this could help detect repairs immediately, especially
> > if the same system keeps repairing the same thing or multiple systems
> > are performing the same repair, indicating a fundamental root cause
> > that requires administrator intervention.
> >
> > IMHO (if anyone thinks this opinion is misguided please say so),
> > Cfengine shouldn’t have to repair anything in a properly functioning
> > environment and, if it does, then something needs investigating. It
> > may just be someone manually changing a file’s permissions and
> > Cfengine correcting them (which may mean user/admin training is
> > required). This philosophy does assume, however, that promises are
> > written so that they make corrections only when necessary.
> >
> > For example, if I have a promise to ensure that a Solaris system’s
> > hostname is in /etc/nodename, I should write the promise so that it
> > doesn’t do anything if the file is correct, rather than just
> > recreating the correct file every time the agent runs, regardless of
> > whether the file’s contents are already correct.
> >
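The "don't touch the file if it is already correct" behaviour described above can be sketched outside Cfengine like this. It is a generic illustration of a convergent write, not Cfengine policy syntax; the function name ensure_contents and the "kept"/"repaired" return values are assumptions chosen to mirror the terms used in this thread.

```python
import tempfile
from pathlib import Path


def ensure_contents(path: Path, desired: str) -> str:
    """Write `desired` to `path` only when the file is missing or different.

    Returns "kept" when nothing had to change and "repaired" when the file
    was (re)written, so that only real corrections are reported.
    """
    if path.exists() and path.read_text() == desired:
        return "kept"
    path.write_text(desired)
    return "repaired"


if __name__ == "__main__":
    # A scratch file stands in for /etc/nodename in this sketch.
    nodename = Path(tempfile.mkdtemp()) / "nodename"
    print(ensure_contents(nodename, "myhost\n"))
    print(ensure_contents(nodename, "myhost\n"))
```

Written this way, a "repaired" result always means something actually drifted, which is what makes alerting on repairs meaningful.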
> > Any thoughts or comments on this?
> >
> > Thanks,
> > Justin
> >
> > _______________________________________________
> > Help-cfengine mailing list
> > Help-cfengine@cfengine.org
> > https://cfengine.org/mailman/listinfo/help-cfengine