On 10.02.2014 21:09, Carlos Cesario wrote:
> Good feature!!!
> I tested and it's working!!!!
Thanks for testing.
> The result is:
> Icinga 2 Cluster Problem: 1 Endpoints (icinga-node-2) not connected.
> One question:
> Why do all the services in icinga-node-2 remain "ONLINE"? Shouldn't
> these services switch to offline too?!
I'm not sure what you mean by the terms "online" and "offline".
Depending on the check authority, the checks executed on the
secondary node will stay in the same state as before, and once the
cluster connection is re-established, the check history will be
synchronized from b->a again.
If you have a better idea, feel free to propose/discuss it. One of our
ideas, which does not really work, was the following:
a ----------------------X---------------> b
freshness triggers                  normal check
result is stale, not-ok             check result, history
....                                ....
<-------connection re-established------->
history out-of-sync                 history out-of-sync
So that way won't work very well unless you don't care about a somewhat
mixed/merged history and other strange effects.
A different approach could be a special state type (or field) for the
service's cluster status; based on its authority compared to the
current cluster state, it could indicate that the current result is
stale because the node is down (but that would rather be a UI feature
then).
Though, that only works if authorities are used for check execution on
specific nodes. If simple check distribution is in place, removing a
cluster node with the checker feature enabled will make the other
nodes re-calculate the check distribution (the "magic hash algorithm")
until that specific node comes back online.
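The "magic hash algorithm" isn't spelled out here, but the effect described above (checks getting reassigned when a node drops out) can be sketched with a minimal, hypothetical hash-based distribution. The function and service names below are illustrative assumptions, not Icinga 2's actual implementation:

```python
import hashlib

def assign_checks(services, nodes):
    """Hypothetical sketch: map each service onto one of the currently
    connected nodes via a hash, so the assignment is deterministic while
    the node set is stable, and gets re-calculated when a node goes away."""
    alive = sorted(nodes)  # stable ordering so all nodes agree
    assignment = {}
    for service in services:
        digest = hashlib.md5(service.encode("utf-8")).hexdigest()
        assignment[service] = alive[int(digest, 16) % len(alive)]
    return assignment

# With both nodes connected, checks are spread over icinga2a and icinga2b:
both = assign_checks(["ping", "http", "load"], ["icinga2a", "icinga2b"])

# Once icinga2b drops off, every check lands on icinga2a until it returns:
alone = assign_checks(["ping", "http", "load"], ["icinga2a"])
```

Note that with a plain modulo scheme like this, a node leaving reshuffles many assignments; that is exactly the re-calculation behaviour mentioned above.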
Best regards,
On 10-02-2014 13:17, Michael Friedrich wrote:
Hi,
Icinga 2 0.0.8 targets cluster & configuration finalization. Therefore
the current snapshot builds contain a simple cluster check which will
turn critical once one or more nodes go away.
It's an internal check method provided as a check command by the ITL
(a package upgrade to the latest snapshot builds is required).
http://docs.icinga.org/icinga2/snapshot/#cluster-health-check
My two test nodes are icinga2a (config master) and icinga2b (checker).
After killing off the remote node icinga2b, the documentation's example
check switches to critical. The 'authorities' attribute makes sure
that the service check is only executed on node icinga2a.
object Host "icinga2a" inherits "generic-host" {
  services["cluster"] = {
    templates = [ "generic-service" ],
    check_interval = 1m,
    check_command = "cluster",
    authorities = [ "icinga2a" ]
  },
}
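The cluster check's behaviour as described (OK while all endpoints are connected, critical once one or more go away, with output like the one reported above) boils down to something like the following sketch. This is a hypothetical illustration, not the actual internal check method:

```python
def cluster_check(endpoints):
    """Hypothetical sketch of a simple cluster health check.

    endpoints: dict mapping endpoint name -> connected (bool).
    Returns a (plugin exit code, output line) pair: 0 = OK, 2 = CRITICAL.
    """
    down = [name for name, connected in endpoints.items() if not connected]
    if down:
        # Matches the shape of the output reported earlier in the thread.
        return 2, ("Icinga 2 Cluster Problem: %d Endpoints (%s) not connected."
                   % (len(down), ", ".join(down)))
    return 0, "Icinga 2 Cluster OK: all endpoints connected."

code, output = cluster_check({"icinga-node-2": False})
# code is 2 (CRITICAL) because icinga-node-2 is not connected.
```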
You'll also notice that the 'icinga' self-stats check now contains more
performance data values (*execution_time, state counters, etc.) in
order to satisfy performance graphers expecting the ordinary
icingastats output.
http://docs.icinga.org/icinga2/snapshot/#itl-icinga
Have fun playing with Icinga 2 :)
Carlos
--
DI (FH) Michael Friedrich
michael.friedr...@gmail.com || icinga open source monitoring
https://twitter.com/dnsmichi || lead core developer
dnsmi...@jabber.ccc.de || https://www.icinga.org/team
irc.freenode.net/icinga || dnsmichi
_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users