On Wednesday, 01.09.2010 at 16:18 +0800, Alister Wong wrote:
> Hi, Michael. Thanks for the reply.
>
> Actually, I want to set it up so that the resources are passed over once
> any error occurs. Can a node always pass the resources to the other node
> automatically, even if that node encountered an error before?
> For example:
>
> At the beginning all my resources are located on NodeA; then NodeA
> encounters an error and the resources fail over to NodeB. crm_mon then
> shows a failed action for NodeA. When NodeB encounters an error, it should
> pass the resources back to NodeA.
>
> I am not sure what configuration can be set to achieve this.
>
> Or is there any command I can run to make NodeA ready to receive the
> resources again? Currently I know that I can't pass the resources to a
> node that failed before.
>
> Thank you.
>
> Alister
>
> -----Original Message-----
> From: Michael Schwartzkopff [mailto:mi...@clusterbau.com]
> Sent: Tuesday, August 31, 2010 9:53 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Make 2 nodes failover to each other
>
> On Tuesday, 31.08.2010 at 21:24 +0800, Alister Wong wrote:
> > Hi,
> >
> > I am new to Linux clustering and have a question about a 2-node cluster.
> >
> > I want to build a cluster with Jakarta Tomcat where the nodes fail over
> > to each other if an error is detected (e.g. the gateway fails to ping).
> >
> > However, with my current settings, once node A encounters an error, it
> > fails over to the other node (B). Then if B fails, the resources can't
> > fail back to A.
> >
> > Can anyone help me make the resources fail over back and forth whenever
> > an error is encountered?
> >
> > Do I have to do something to make a failed node ready for use again?
> > If so, can anyone tell me how?
> >
> > Below is my configuration:
> >
> > [r...@nmc01-a ~]# crm configure show
> > node nmc01-a
> > node nmc01-b
> > primitive ClusterIP ocf:heartbeat:IPaddr2 \
> >         params ip="10.214.65.5" cidr_netmask="24" \
> >         op monitor interval="30s"
> > primitive Tomcat ocf:heartbeat:tomcat \
> >         operations $id="Tomcat-operations" \
> >         op monitor interval="30" timeout="30" \
> >         op start interval="0" timeout="70" \
> >         op stop interval="0" timeout="120" \
> >         params catalina_home="/opt/apache-tomcat-6.0.26" java_home="/usr/java/jdk1.6.0_21" tomcat_user="nmc" \
> >         meta target-role="Started"
> > primitive pingd ocf:pacemaker:pingd \
> >         params host_list="10.214.65.254" multiplier="100" \
> >         op monitor interval="60s" timeout="50s" on_fail="restart" \
> >         op start interval="0" timeout="90" \
> >         op stop interval="0" timeout="100"
> > group nmc_web ClusterIP Tomcat
> > clone pingdclone pingd \
> >         meta globally-unique="false"
> > location nmc_web_connected_node nmc_web \
> >         rule $id="nmc_web_connected_node-rule" -inf: pingd lte 0
> > property $id="cib-bootstrap-options" \
> >         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
> >         cluster-infrastructure="openais" \
> >         expected-quorum-votes="2" \
> >         stonith-enabled="false" \
> >         no-quorum-policy="ignore" \
> >         last-lrm-refresh="1283167810"
> > rsc_defaults $id="rsc-options" \
> >         resource-stickiness="100"
> >
> > By the way, about the pingd example from clusterlab.org:
> > what does "not_defined pingd" mean in the rule setting below?
> >
> > crm(pingd)configure# location my_web_cluster_on_connected_node my_web_cluster \
> >         rule -inf: not_defined pingd or pingd lte 0
>
> You have a resource-stickiness defined. So after the first node becomes
> available again, the resource stays where it is running and does not fail
> back.
>
> > When I included "not_defined pingd" in my cluster configuration and one
> > of the nodes hadn't started up, pingd on that node wasn't started, which
> > prevented my other resources (virtual IP and Tomcat) from starting.
>
> Perhaps a syntax error because you forgot the "or"?
>
> The "defined" part also checks whether the pingd attribute is defined at
> all on a node, so it prevents resources from running on a node where the
> ping resource is not running.
>
> Michael.
First of all: http://www.caliburn.nl/topposting.html

You can achieve this behaviour by setting migration-threshold to 1. A
failure on node A then stops the resource on node A and starts it on
node B. You would have to clear the fail count on node A manually. The
resource-stickiness makes the resource stay on node B until that node is
no longer able to run the resource.
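
In crm shell terms that could look roughly like the following; this is only
a sketch reusing the group, constraint and node names from the configuration
quoted above (nmc_web, nmc_web_connected_node, nmc01-a), not a tested recipe:

  # move the group away after a single failure (here as a cluster-wide
  # default; it could also be set as a meta attribute on the group)
  crm configure rsc_defaults migration-threshold=1

  # after node A has been repaired, clear its fail count so it is
  # eligible to run the group again
  crm resource cleanup nmc_web nmc01-a

If you also want to keep the group off nodes where the pingd attribute is
not defined at all, the existing location constraint would read (edited
e.g. with "crm configure edit"):

  location nmc_web_connected_node nmc_web \
          rule -inf: not_defined pingd or pingd lte 0

With resource-stickiness=100 still in place, the group then stays on
whichever node it last failed over to until that node itself fails;
"crm_mon -f" shows the per-node fail counts, so you can see when a cleanup
is due.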