Thanks Andrew for your input.

Andrew Beekhof <andrew@...> writes:

>
> On Fri, Nov 23, 2012 at 11:47 PM, Parshvi <parshvi.17@...> wrote:
> > Hi,
> > We are upgrading to Pacemaker 1.1.7 and Corosync 1.4.3.
> > The previous version was:
> > Pacemaker: 1.0.12
> > Corosync : 1.2.7
> > The issues faced in the older version are:
> > 1) Numerous, Policy engine and crmd crashes, stopping failed cluster resources
> > from recovering.
>
> Did you report any of these?
> I can't fix bugs I don't know about.

I have raised the issue on the forum, but haven't opened a bug on bugzilla yet.
I will file a bug for the issue now.

>
> > 2) pacemaker logs show FSM in pending state, service comes in sync only after a
> > restart.
>
> As above.

Raised the issue on the forum. Will file a bug now.

>
> >
> > Environment:
> > 1) OS: OEL 5.8
> > RPMS(packages) for Pacemaker 1.1.7, Corosync 1.4.3 and other dependent pkgs are
> > not available for OEL 5.8. Hence, we have build all pkgs from source (github).
>
> Did you try the ones at: http://clusterlabs.org/rpm-next/

Yes, while working on the issue I went to clusterlabs.org for help.
I have worked with the rpm-next packages for pacemaker 1.1.8 and corosync 1.4.1.
The nodes come ONLINE, as expected.

I am using the old resource-agents version, 1.0.4. (I didn't find RPMs for the
latest version on clusterlabs. Can you suggest where I can find RPMs for the
latest release of resource-agents?)

According to
http://upstream-tracker.org/changelogs/pacemaker/1.1.8/changelog.html
crm has become a separate project. Hence I will be installing the crm/cli now.

>
> >
> > We have a two node cluster. We have installed the build binaries on both cluster
> > nodes. crm_mon shows both nodes as online. All processes of corosync and
> > pacemaker appear started and running.
> >
> > Issues faced:
> > We have another setup, consisting of two nodes in the cluster(same as above).
> > Pkg binaries have been installed on both the nodes.
> > One of the nodes appears UNCLEAN (offline) and other node appears (offline).
> > crmd process continuously respawns until its max respawn count is reached. DC
> > appears NONE in crm_mon.
> >
> > I have checked selinux, firewall on the nodes(its disabled).
> >
> > I have an hb_report of the nodes. I can share it if needed.
>
> Yes please. Not much we can do without it. Or at least without some
> sort of description beyond "the crmd respawns".

Will share the hb_report.

>
> > I also created another cluster of 2 nodes: One node was from WORKING
> > cluster and
> > another node was from NON_WORKING cluster.
> > A dump of the o/p of crm_mon of such a cluster is:
> >
> > Last updated: Sat Nov 17 19:53:37 2012
> > Last change: Sat Nov 17 19:53:27 2012 via crmd on node-112
> > Stack: openais
> > Current DC: node-112 - partition with quorum
> > Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> > 2 Nodes configured, 2 expected votes
> > 0 Resources configured.
> > ============
> >
> > Node node-122: UNCLEAN (offline)
> > Online: [ node-112 ]
> >
> >
> > After some time the UNCLEAN(offline) node appears offline:
> >
> > Last updated: Sat Nov 17 20:26:48 2012
> > Last change: Sat Nov 17 20:15:38 2012 via cibadmin on node-112
> > Stack: openais
> > Current DC: node-112 - partition with quorum
> > Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
> > 2 Nodes configured, 2 expected votes
> > 0 Resources configured.
> > ============
> >
> > Online: [ node-112 ]
> > OFFLINE: [ node-122 ]
> >
> > I would request the owners to please respond with some input.
> > The old version is a concern at our production.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org