Hi Jake,

I erased the files as mentioned and started the services. This is what I get on pilotpound after running crm_mon:
============
Last updated: Fri Jul 20 17:45:58 2012
Last change:
Current DC: NONE
0 Nodes configured, unknown expected votes
0 Resources configured.
============

Looks like the system didn't join the cluster.

Any suggestions are welcome.

Kind regards

fatharly

-------- Original Message --------
> Date: Fri, 20 Jul 2012 10:49:15 -0400 (EDT)
> From: Jake Smith <jsm...@argotec.com>
> To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
> Subject: Re: [Pacemaker] problem with pacemaker/corosync on CentOS 6.3

>
> ----- Original Message -----
> > From: fatcha...@gmx.de
> > To: pacemaker@oss.clusterlabs.org
> > Sent: Friday, July 20, 2012 6:08:45 AM
> > Subject: [Pacemaker] problem with pacemaker/corosync on CentOS 6.3
> >
> > Hi,
> >
> > I'm using a pacemaker+corosync bundle to run a pound based
> > loadbalancer. After an update on CentOS 6.3 there is some mismatch
> > of the node status. Via crm_mon on one node everything looks fine
> > while on the other node everything is offline. Everything was fine
> > on CentOS 6.2.
> >
> > Node powerpound:
> >
> > ============
> > Last updated: Fri Jul 20 12:04:29 2012
> > Last change: Thu Jul 19 17:58:31 2012 via crm_attribute on pilotpound
> > Stack: openais
> > Current DC: powerpound - partition with quorum
> > Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
> > 2 Nodes configured, 2 expected votes
> > 7 Resources configured.
> > ============
> >
> > Online: [ powerpound pilotpound ]
> >
> > HA_IP_1 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_2 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_3 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_4 (ocf::heartbeat:IPaddr2): Started powerpound
> > HA_IP_5 (ocf::heartbeat:IPaddr2): Started powerpound
> > Clone Set: pingclone [ping-gateway]
> >     Started: [ pilotpound powerpound ]
> >
> >
> > Node pilotpound:
> >
> > ============
> > Last updated: Fri Jul 20 12:04:32 2012
> > Last change: Thu Jul 19 17:58:17 2012 via crm_attribute on pilotpound
> > Stack: openais
> > Current DC: NONE
> > 2 Nodes configured, 2 expected votes
> > 7 Resources configured.
> > ============
> >
> > OFFLINE: [ powerpound pilotpound ]
> >
> >
> > from /var/log/messages on pilotpound:
> >
> > Jul 20 12:06:12 pilotpound cib[24755]: warning: cib_peer_callback:
> > Discarding cib_apply_diff message (35909) from powerpound: not in
> > our membership
> > Jul 20 12:06:12 pilotpound cib[24755]: warning: cib_peer_callback:
> > Discarding cib_apply_diff message (35910) from powerpound: not in
> > our membership
> >
> >
> > How could this happen and what can I do to solve this problem?
>
> Pretty sure it had nothing to do with upgrade - I had this the other day
> on Ubuntu 12.04 after a reboot of both nodes. I believe a couple experts
> called it a "transient" bug. See:
> https://bugzilla.redhat.com/show_bug.cgi?id=820821
> https://bugzilla.redhat.com/show_bug.cgi?id=5040
>
> >
> > Any suggestions are welcome
>
> I fixed by stopping/killing pacemaker/corosync on offending node
> (pilotpound). Then cleared these files out on same node:
> rm /var/lib/heartbeat/crm/cib*
> rm /var/lib/pengine/*
>
> Then restart corosync/pacemaker and the node rejoined fine.
>
> HTH
>
> Jake
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
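
For reference, the recovery steps Jake describes in the quoted reply (stop the cluster stack on the offending node, clear the local CIB and pengine files, restart) could be sketched roughly as below. This is only a sketch under assumptions, not a tested procedure: the `service pacemaker` / `service corosync` invocations are assumed (on a plugin-based "openais" stack Pacemaker may be started by corosync itself, in which case there is no separate pacemaker service), `rm -f` is used instead of the plain `rm` from the thread, and `DRY_RUN`, `run` and `reset_local_cib` are hypothetical names introduced here. By default it only prints what it would do. Note that in this thread the same steps left pilotpound with "Current DC: NONE", so it may not be sufficient on its own.

```shell
#!/bin/sh
# Sketch of the cleanup from the thread; run on the affected node only
# (pilotpound in this thread), never on the healthy node.
# DRY_RUN is a hypothetical safety switch: 1 = print only, 0 = execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Show the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reset_local_cib() {
    # Assumption: both daemons are managed through `service`.
    run service pacemaker stop
    run service corosync stop
    # These two paths are the ones given in the thread.
    run sh -c 'rm -f /var/lib/heartbeat/crm/cib*'
    run sh -c 'rm -f /var/lib/pengine/*'
    # Start the membership layer first, then Pacemaker.
    run service corosync start
    run service pacemaker start
}

reset_local_cib
```

Running it as-is prints the six commands; setting `DRY_RUN=0` executes them. Afterwards, `crm_mon -1` on both nodes shows whether the node has rejoined.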