Hi Chris,

For me, the main question is what you want to monitor on the HA cluster. I always follow a differentiated approach:
There are two types of services: load-balanced services that normally run on all cluster nodes, and services that fail over to another node if the current node fails.

In the first case, you need to monitor the service on each individual node so that you notice when one instance goes down. That lets you act on a single-instance failure that doesn't show up in the cluster service itself because the cluster is resilient. For the cluster service, you can then either define an additional host using the cluster VIP and query it remotely, or use the Business Process module for Icinga Web 2 to derive the state of the combined service from the local services.

The second case is a bit more involved, as you don't know which cluster node the service is supposed to run on. For Pacemaker clusters on RHEL I usually resort to the cluster-snmp package, which provides an SNMP sub-agent that gives access to the cluster state and which I can query remotely from a satellite zone or the master zone. It's pretty easy to write some SNMP queries that give a good overview of the overall state of the cluster and the services running on it.

This approach has one minor drawback: for maximum monitoring resiliency you need to run the service check against all cluster hosts, which means that if something fails you see multiple identical alerts. That can be solved by using keepalived and a VIP on the cluster hosts, so that one of the SNMP daemons answers all the queries.

Peter.

_______________________________________________
icinga-users mailing list
icinga-users@lists.icinga.org
https://lists.icinga.org/mailman/listinfo/icinga-users
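
As an illustration of the load-balanced case described above, here is a minimal Python sketch of how a combined state could be derived from the per-node checks. It is not how the Icinga Web 2 module computes its result. The function name cluster_state, the min_ok parameter, the node names, and the rule itself (all nodes OK gives OK, at least min_ok nodes OK gives WARNING, otherwise CRITICAL) are assumptions made for illustration. Only the numeric values follow the usual plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from enum import IntEnum


class State(IntEnum):
    # Conventional monitoring plugin exit codes.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def cluster_state(node_states, min_ok=1):
    """Derive one state for a load-balanced service from its per-node states.

    node_states maps a node name to the State of the local service check.
    min_ok is the (assumed) number of healthy instances the service needs.
    """
    if not node_states:
        # Nothing was checked, so nothing can be said about the service.
        return State.UNKNOWN

    ok_count = sum(1 for state in node_states.values() if state == State.OK)

    if ok_count == len(node_states):
        return State.OK        # every instance is healthy
    if ok_count >= min_ok:
        return State.WARNING   # degraded, but the cluster still serves
    return State.CRITICAL      # too few healthy instances remain


if __name__ == "__main__":
    # Hypothetical two-node cluster with one failed instance.
    states = {"node1": State.OK, "node2": State.CRITICAL}
    print(cluster_state(states).name)
```

With a rule like this, a single failed instance shows up as a degraded cluster service while the per-node check still tells you which node to look at.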