On Fri, Nov 23, 2012 at 11:47 PM, Parshvi <parshvi...@gmail.com> wrote:
> Hi,
> We are upgrading to Pacemaker 1.1.7 and Corosync 1.4.3.
> The previous version was:
> Pacemaker: 1.0.12
> Corosync : 1.2.7
> The issues faced in the older version are:
> 1) Numerous policy engine and crmd crashes, stopping failed cluster
> resources from recovering.

Did you report any of these? I can't fix bugs I don't know about.

> 2) Pacemaker logs show the FSM in a pending state; the service comes in
> sync only after a restart.

As above.

>
> Environment:
> 1) OS: OEL 5.8
> RPMs (packages) for Pacemaker 1.1.7, Corosync 1.4.3 and other dependent
> pkgs are not available for OEL 5.8. Hence, we have built all pkgs from
> source (github).

Did you try the ones at: http://clusterlabs.org/rpm-next/

>
> We have a two node cluster. We have installed the built binaries on both
> cluster nodes. crm_mon shows both nodes as online. All processes of
> corosync and pacemaker appear started and running.
>
> Issues faced:
> We have another setup, consisting of two nodes in the cluster (same as
> above). Pkg binaries have been installed on both the nodes.
> One of the nodes appears UNCLEAN (offline) and the other node appears
> (offline). The crmd process continuously respawns until its max respawn
> count is reached. DC appears as NONE in crm_mon.
>
> I have checked selinux and the firewall on the nodes (both are disabled).
>
> I have an hb_report of the nodes. I can share it if needed.

Yes please. Not much we can do without it.
Or at least without some sort of description beyond "the crmd respawns".

> I also created another cluster of 2 nodes: one node was from the WORKING
> cluster and the other node was from the NON_WORKING cluster.
> A dump of the o/p of crm_mon of such a cluster is:
>
> Last updated: Sat Nov 17 19:53:37 2012
> Last change: Sat Nov 17 19:53:27 2012 via crmd on node-112
> Stack: openais
> Current DC: node-112 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 0 Resources configured.
> ============
>
> Node node-122: UNCLEAN (offline)
> Online: [ node-112 ]
>
>
> After some time the UNCLEAN (offline) node appears offline:
>
> Last updated: Sat Nov 17 20:26:48 2012
> Last change: Sat Nov 17 20:15:38 2012 via cibadmin on node-112
> Stack: openais
> Current DC: node-112 - partition with quorum
> Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> 2 Nodes configured, 2 expected votes
> 0 Resources configured.
> ============
>
> Online: [ node-112 ]
> OFFLINE: [ node-122 ]
>
> I would request the owners to please respond with some input. The old
> version is a concern at our production.
>
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org