Dear List,

We are using Pacemaker and Corosync with CMAN as our HA software, with the versions below.

OS: CentOS release 6.5 (Final) 64-bit
Pacemaker: pacemaker.x86_64 1.1.10-14.el6_5.3
Corosync: corosync.x86_64 1.4.1-17.el6_5.1
CMAN: cman.x86_64 3.0.12.1-59.el6_5.2
Resource-Agent: resource-agents.x86_64 3.9.5-3.12
Topology: 2 nodes in an Active/Standby model. (MySQL is Active/Active via a clone.)

All packages were installed from the official CentOS repository, except resource-agents, which was installed from the openSUSE repository (http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/).

The system worked normally for a few months until yesterday morning, around 03:35 UTC+0700, when we found that one of the resources had gone into the UNMANAGED state without any configuration change. After another resource failed, Pacemaker tried to fail the resources over to the other node, but the failover could not complete once it reached the unmanaged resource.

The configuration of some of the resources is below, and the log from the time of the event is in the attached file.

    primitive res.vBKN6 IPv6addr \
        params ipv6addr="2001:db8:0:f::61a" cidr_netmask=64 nic=eth0 \
        op monitor interval=10s
    primitive res.vDMZ6 IPv6addr \
        params ipv6addr="2001:db8:0:9::61a" cidr_netmask=64 nic=eth1 \
        op monitor interval=10s
    group gr.mainService res.vDMZ4 res.vDMZ6 res.vBKN4 res.vBKN6 res.http res.ftp
    rsc_defaults rsc_defaults-options: \
        migration-threshold=1

Please help me solve this problem.

--teenigma
corosync.log
Description: Binary data