Hi Andrew, I am adding the log messages I get when I commit the crm configuration, together with the crm_verify -LV output, for your consideration. My crm configuration is attached. It is showing that the resources cannot run anywhere. What should I do?
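If the allocation scores would help, I can post those as well; as far as I can tell from the man page, ptest can dump them from the live CIB (the exact options may differ on this version, so please treat this as a sketch of what I would run):

r...@node1:~# ptest -L -s

That should show the score each node ends up with for vir-ip and the slony resources.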
crm_verify -LV snippet
-------------------------------
r...@node1:~# crm_verify -LV
crm_verify[10393]: 2010/02/23_11:27:44 WARN: native_color: Resource vir-ip cannot run anywhere
crm_verify[10393]: 2010/02/23_11:27:44 WARN: native_color: Resource slony-fail cannot run anywhere
crm_verify[10393]: 2010/02/23_11:27:44 WARN: native_color: Resource slony-fail2 cannot run anywhere
Warnings found during check: config may not be valid
r...@node1:~# crm_verify -LV
crm_verify[10760]: 2010/02/23_11:32:50 WARN: native_color: Resource vir-ip cannot run anywhere
crm_verify[10760]: 2010/02/23_11:32:50 WARN: native_color: Resource slony-fail cannot run anywhere
crm_verify[10760]: 2010/02/23_11:32:50 WARN: native_color: Resource slony-fail2 cannot run anywhere
Warnings found during check: config may not be valid
--------------------------------------------------------------

Log snippet
-------------------------------------------------
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <cib admin_epoch="0" epoch="285" num_updates="33" >
Feb 23 11:25:48 node1 crmd: [1629]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <configuration >
Feb 23 11:25:48 node1 crmd: [1629]: info: need_abort: Aborting on change to admin_epoch
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <constraints >
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <rsc_location id="vir-ip-with-pingd" >
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - <rule score="-1000" id="vir-ip-with-pingd-rule" />
Feb 23 11:25:48 node1 crmd: [1629]: info: do_pe_invoke: Query 187: Requesting the current CIB: S_POLICY_ENGINE
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </rsc_location>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </constraints>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </configuration>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: - </cib>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <cib admin_epoch="0" epoch="286" num_updates="1" >
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <configuration >
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <constraints >
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <rsc_location id="vir-ip-with-pingd" >
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + <rule score="-INFINITY" id="vir-ip-with-pingd-rule" />
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </rsc_location>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </constraints>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </configuration>
Feb 23 11:25:48 node1 cib: [1625]: info: log_data_element: cib:diff: + </cib>
Feb 23 11:25:48 node1 cib: [1625]: info: cib_process_request: Operation complete: op cib_replace for section constraints (origin=local/cibadmin/2, version=0.286.1): ok (rc=0)
Feb 23 11:25:48 node1 crmd: [1629]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1266904548-176, seq=12, quorate=1
Feb 23 11:25:48 node1 pengine: [6277]: notice: unpack_config: On loss of CCM Quorum: Ignore
Feb 23 11:25:48 node1 pengine: [6277]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Feb 23 11:25:48 node1 pengine: [6277]: info: determine_online_status: Node node2 is online
Feb 23 11:25:48 node1 pengine: [6277]: info: determine_online_status: Node node1 is online
Feb 23 11:25:48 node1 pengine: [6277]: info: unpack_rsc_op: slony-fail2_monitor_0 on node1 returned 0 (ok) instead of the expected value: 7 (not running)
Feb 23 11:25:48 node1 pengine: [6277]: notice: unpack_rsc_op: Operation slony-fail2_monitor_0 found resource slony-fail2 active on node1
Feb 23 11:25:48 node1 pengine: [6277]: info: unpack_rsc_op: pgsql:1_monitor_0 on node1 returned 0 (ok) instead of the expected value: 7 (not running)
Feb 23 11:25:48 node1 pengine: [6277]: notice: unpack_rsc_op: Operation pgsql:1_monitor_0 found resource pgsql:1 active on node1
Feb 23 11:25:48 node1 pengine: [6277]: notice: native_print: vir-ip (ocf::heartbeat:IPaddr2): Started node1
Feb 23 11:25:48 node1 pengine: [6277]: notice: native_print: slony-fail (lsb:slony_failover): Started node1
Feb 23 11:25:48 node1 pengine: [6277]: notice: clone_print: Clone Set: pgclone
Feb 23 11:25:48 node1 pengine: [6277]: notice: print_list: Started: [ node2 node1 ]
Feb 23 11:25:48 node1 pengine: [6277]: notice: native_print: slony-fail2 (lsb:slony_failover2): Started node1
Feb 23 11:25:48 node1 pengine: [6277]: notice: clone_print: Clone Set: pingclone
Feb 23 11:25:48 node1 pengine: [6277]: notice: print_list: Started: [ node2 node1 ]
Feb 23 11:25:48 node1 pengine: [6277]: info: native_merge_weights: vir-ip: Rolling back scores from slony-fail
Feb 23 11:25:48 node1 pengine: [6277]: info: native_merge_weights: vir-ip: Rolling back scores from slony-fail2
Feb 23 11:25:48 node1 pengine: [6277]: WARN: native_color: Resource vir-ip cannot run anywhere
Feb 23 11:25:48 node1 pengine: [6277]: WARN: native_color: Resource slony-fail cannot run anywhere
Feb 23 11:25:48 node1 pengine: [6277]: WARN: native_color: Resource slony-fail2 cannot run anywhere
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Stop resource vir-ip (node1)
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Stop resource slony-fail (node1)
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pgsql:0 (Started node2)
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pgsql:1 (Started node1)
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Stop resource slony-fail2 (node1)
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pingd:0 (Started node2)
Feb 23 11:25:48 node1 pengine: [6277]: notice: LogActions: Leave resource pingd:1 (Started node1)
Feb 23 11:25:48 node1 lrmd: [1626]: info: rsc:slony-fail:41: stop
Feb 23 11:25:48 node1 cib: [10242]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-8.raw
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Feb 23 11:25:48 node1 lrmd: [1626]: info: rsc:slony-fail2:42: stop
Feb 23 11:25:48 node1 crmd: [1629]: info: unpack_graph: Unpacked transition 22: 4 actions in 4 synapses
Feb 23 11:25:48 node1 crmd: [1629]: info: do_te_invoke: Processing graph 22 (ref=pe_calc-dc-1266904548-176) derived from /var/lib/pengine/pe-warn-101.bz2
Feb 23 11:25:48 node1 crmd: [1629]: info: te_rsc_command: Initiating action 11: stop slony-fail_stop_0 on node1 (local)
Feb 23 11:25:48 node1 crmd: [1629]: info: do_lrm_rsc_op: Performing key=11:22:0:fd31c6bc-df43-4481-8b69-2c54c50075fb op=slony-fail_stop_0 )
Feb 23 11:25:48 node1 crmd: [1629]: info: te_rsc_command: Initiating action 28: stop slony-fail2_stop_0 on node1 (local)
Feb 23 11:25:48 node1 crmd: [1629]: info: do_lrm_rsc_op: Performing key=28:22:0:fd31c6bc-df43-4481-8b69-2c54c50075fb op=slony-fail2_stop_0 )
Feb 23 11:25:48 node1 lrmd: [10244]: WARN: For LSB init script, no additional parameters are needed.
Feb 23 11:25:48 node1 lrmd: [10243]: WARN: For LSB init script, no additional parameters are needed.
Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation slony-fail_stop_0 (call=41, rc=0, cib-update=188, confirmed=true) complete ok
Feb 23 11:25:48 node1 cib: [10242]: info: write_cib_contents: Wrote version 0.286.0 of the CIB to disk (digest: aaddbe7aeaf08365be5bbbdb4931295e)
Feb 23 11:25:48 node1 crmd: [1629]: info: match_graph_event: Action slony-fail_stop_0 (11) confirmed on node1 (rc=0)
Feb 23 11:25:48 node1 pengine: [6277]: WARN: process_pe_message: Transition 22: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-101.bz2
Feb 23 11:25:48 node1 pengine: [6277]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation slony-fail2_stop_0 (call=42, rc=0, cib-update=189, confirmed=true) complete ok
Feb 23 11:25:48 node1 lrmd: [1626]: info: rsc:vir-ip:43: stop
Feb 23 11:25:48 node1 crmd: [1629]: info: match_graph_event: Action slony-fail2_stop_0 (28) confirmed on node1 (rc=0)
Feb 23 11:25:48 node1 crmd: [1629]: info: te_rsc_command: Initiating action 10: stop vir-ip_stop_0 on node1 (local)
Feb 23 11:25:48 node1 crmd: [1629]: info: do_lrm_rsc_op: Performing key=10:22:0:fd31c6bc-df43-4481-8b69-2c54c50075fb op=vir-ip_stop_0 )
Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation vir-ip_monitor_15000 (call=31, rc=-2, cib-update=0, confirmed=true) Cancelled unknown exec error
Feb 23 11:25:48 node1 cib: [10242]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.gwOpFZ (digest: /var/lib/heartbeat/crm/cib.UtFyLu)
Feb 23 11:25:48 node1 IPaddr2[10249]: [10285]: INFO: ip -f inet addr delete 192.168.10.10/24 dev eth0
Feb 23 11:25:48 node1 IPaddr2[10249]: [10287]: INFO: ip -o -f inet addr show eth0
Feb 23 11:25:48 node1 crmd: [1629]: info: process_lrm_event: LRM operation vir-ip_stop_0 (call=43, rc=0, cib-update=190, confirmed=true) complete ok
Feb 23 11:25:48 node1 crmd: [1629]: info: match_graph_event: Action vir-ip_stop_0 (10) confirmed on node1 (rc=0)
Feb 23 11:25:48 node1 crmd: [1629]: info: te_pseudo_action: Pseudo action 6 fired and confirmed
Feb 23 11:25:48 node1 crmd: [1629]: info: run_graph: ====================================================
Feb 23 11:25:48 node1 crmd: [1629]: notice: run_graph: Transition 22 (Complete=4, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-warn-101.bz2): Complete
Feb 23 11:25:48 node1 crmd: [1629]: info: te_graph_trigger: Transition 22 is now complete
Feb 23 11:25:48 node1 crmd: [1629]: info: notify_crmd: Transition 22 status: done - <null>
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Feb 23 11:25:48 node1 crmd: [1629]: info: do_state_transition: Starting PEngine Recheck Timer
Feb 23 11:27:11 node1 cib: [1625]: info: cib_stats: Processed 88 operations (8295.00us average, 0% utilization) in the last 10min
------------------------------------------------------------

crm_mon snippet
------------------------------------------
============
Last updated: Tue Feb 23 11:27:56 2010
Stack: Heartbeat
Current DC: node1 (ac87f697-5b44-4720-a8af-12a6f2295930) - partition with quorum
Version: 1.0.5-3840e6b5a305ccb803d29b468556739e75532d56
2 Nodes configured, unknown expected votes
5 Resources configured.
============

Online: [ node2 node1 ]

Clone Set: pgclone
        Started: [ node2 node1 ]
Clone Set: pingclone
        Started: [ node2 node1 ]
-------------------------------------------------------------

On Tue, Feb 23, 2010 at 9:38 AM, Jayakrishnan <jayakrishnan...@gmail.com> wrote:

> Sir,
> I am afraid to ask you, but how can I tell pacemaker to compare as a number
> instead of as a string?
> I changed -inf: to -10000 in the pingd location constraint, but the same
> problem persists.
> I also changed the global resource-stickiness to 10000, but it is still not
> working.
>
> With thanks,
> Jayakrishnan.L
>
> On Tue, Feb 23, 2010 at 1:04 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>
>> On Mon, Feb 22, 2010 at 6:46 PM, Jayakrishnan <jayakrishnan...@gmail.com> wrote:
>> > Sir,
>> > I have set up a 2-node cluster with heartbeat 2.99 and pacemaker 1.0.5.
>> > I am using Ubuntu 9.10. Both packages are installed from the Ubuntu karmic
>> > repository.
>> > My packages are:
>> >
>> > heartbeat 2.99.2+sles11r9-5ubuntu1
>> > heartbeat-common 2.99.2+sles11r9-5ubuntu1
>> > heartbeat-common-dev 2.99.2+sles11r9-5ubuntu1
>> > heartbeat-dev 2.99.2+sles11r9-5ubuntu1
>> > libheartbeat2 2.99.2+sles11r9-5ubuntu1
>> > libheartbeat2-dev 2.99.2+sles11r9-5ubuntu1
>> > pacemaker-heartbeat 1.0.5+hg20090813-0ubuntu4
>> > pacemaker-heartbeat-dev 1.0.5+hg20090813-0ubuntu4
>> >
>> > My ha.cf file and crm configuration are both attached to the mail.
>> >
>> > I am making a postgres database cluster with slony replication. eth1 is my
>> > heartbeat link; a crossover cable is connected between the servers on eth1.
>> > eth0 is my external network, where my cluster IP gets assigned.
>> > server1 --> hostname node1
>> > node1 192.168.10.129 eth1
>> > 192.168.1.1 --> eth0
>> >
>> > server2 --> hostname node2
>> > node2 192.168.10.130 eth1
>> > 192.168.1.2 --> eth0
>> >
>> > Now when I pull out my eth1 cable, I need a failover to the other node.
>> > For that I have configured pingd as follows, but it is not working. My
>> > resources are not starting at all when I give the rule as
>> > rule -inf: not_defined pingd or pingd lte 0
>>
>> You need to get 1.0.7 or tell pacemaker to do the comparison as a
>> number instead of as a string.
>>
>> >
>> > I tried changing the -inf: to inf:; then the resources got started, but
>> > resource failover does not take place when I pull out the eth1 cable.
>> >
>> > Please check my configuration and kindly point out what I am missing.
>> > Please note that I am using a default resource stickiness of INFINITY,
>> > which is compulsory for slony replication.
>> >
>> > My ha.cf file
>> > ------------------------------------------------------------------
>> >
>> > autojoin none
>> > keepalive 2
>> > deadtime 15
>> > warntime 10
>> > initdead 64
>> > initdead 64
>> > bcast eth1
>> > auto_failback off
>> > node node1
>> > node node2
>> > crm respawn
>> > use_logd yes
>> > ____________________________________________
>> >
>> > My crm configuration
>> >
>> > node $id="3952b93e-786c-47d4-8c2f-a882e3d3d105" node2 \
>> >         attributes standby="off"
>> > node $id="ac87f697-5b44-4720-a8af-12a6f2295930" node1 \
>> >         attributes standby="off"
>> > primitive pgsql lsb:postgresql-8.4 \
>> >         meta target-role="Started" resource-stickness="inherited" \
>> >         op monitor interval="15s" timeout="25s" on-fail="standby"
>> > primitive pingd ocf:pacemaker:pingd \
>> >         params name="pingd" hostlist="192.168.10.1 192.168.10.75" \
>> >         op monitor interval="15s" timeout="5s"
>> > primitive slony-fail lsb:slony_failover \
>> >         meta target-role="Started"
>> > primitive slony-fail2 lsb:slony_failover2 \
>> >         meta target-role="Started"
>> > primitive vir-ip ocf:heartbeat:IPaddr2 \
>> >         params ip="192.168.10.10" nic="eth0" cidr_netmask="24" broadcast="192.168.10.255" \
>> >         op monitor interval="15s" timeout="25s" on-fail="standby" \
>> >         meta target-role="Started"
>> > clone pgclone pgsql \
>> >         meta notify="true" globally-unique="false" interleave="true" target-role="Started"
>> > clone pingclone pingd \
>> >         meta globally-unique="false" clone-max="2" clone-node-max="1"
>> > location vir-ip-with-pingd vir-ip \
>> >         rule $id="vir-ip-with-pingd-rule" inf: not_defined pingd or pingd lte 0
>> >         meta globally-unique="false" clone-max="2" clone-node-max="1"
>> > colocation ip-with-slony inf: slony-fail vir-ip
>> > colocation ip-with-slony2 inf: slony-fail2 vir-ip
>> > order ip-b4-slony2 inf: vir-ip slony-fail2
>> > order slony-b4-ip inf: vir-ip slony-fail
>> > property $id="cib-bootstrap-options" \
>> >         dc-version="1.0.5-3840e6b5a305ccb803d29b468556739e75532d56" \
>> >         cluster-infrastructure="Heartbeat" \
>> >         no-quorum-policy="ignore" \
>> >         stonith-enabled="false" \
>> >         last-lrm-refresh="1266851027"
>> > rsc_defaults $id="rsc-options" \
>> >         resource-stickiness="INFINITY"
>> >
>> > _____________________________________
>> >
>> > My crm status:
>> > __________________________
>> >
>> > crm(live)# status
>> >
>> > ============
>> > Last updated: Mon Feb 22 23:15:56 2010
>> > Stack: Heartbeat
>> > Current DC: node2 (3952b93e-786c-47d4-8c2f-a882e3d3d105) - partition with quorum
>> > Version: 1.0.5-3840e6b5a305ccb803d29b468556739e75532d56
>> > 2 Nodes configured, unknown expected votes
>> > 5 Resources configured.
>> > ============
>> >
>> > Online: [ node2 node1 ]
>> >
>> > Clone Set: pgclone
>> >         Started: [ node1 node2 ]
>> > Clone Set: pingclone
>> >         Started: [ node2 node1 ]
>> >
>> > ============================
>> >
>> > please help me out.
>> > --
>>

--
Regards,
Jayakrishnan. L
Visit: www.jayakrishnan.bravehost.com
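PS: If I understand the "compare as a number" suggestion correctly, the comparison type can be spelled out in the rule itself. This is only a sketch of what I intend to try next; the number: prefix is what the crm shell syntax help describes, and I have not yet confirmed that the 1.0.5 shell accepts it:

location vir-ip-with-pingd vir-ip \
        rule $id="vir-ip-with-pingd-rule" -inf: not_defined pingd or pingd number:lte 0

Failing that, I suppose the same thing can be done by editing the XML and putting an explicit type attribute on the pingd expression (the 1.0 documentation seems to list "integer"), but I would rather keep it in the shell syntax if possible.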
node $id="3952b93e-786c-47d4-8c2f-a882e3d3d105" node2 \
        attributes standby="off"
node $id="ac87f697-5b44-4720-a8af-12a6f2295930" node1 \
        attributes standby="off"
primitive pgsql lsb:postgresql-8.4 \
        meta target-role="Started" resource-stickness="inherited" \
        op monitor interval="15s" timeout="25s" on-fail="standby"
primitive pingd ocf:pacemaker:pingd \
        params name="pingd" multiplier="100" hostlist="192.168.10.1 192.168.10.69" \
        op monitor interval="15s" timeout="5s"
primitive slony-fail lsb:slony_failover \
        meta target-role="Started"
primitive slony-fail2 lsb:slony_failover2 \
        meta target-role="Started"
primitive vir-ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.10.10" nic="eth0" cidr_netmask="24" broadcast="192.168.10.255" \
        op monitor interval="15s" timeout="25s" on-fail="standby" \
        meta target-role="Started"
clone pgclone pgsql \
        meta notify="true" globally-unique="false" interleave="true" target-role="Started"
clone pingclone pingd \
        meta globally-unique="false" clone-max="2" clone-node-max="1"
location vir-ip-with-pingd vir-ip \
        rule $id="vir-ip-with-pingd-rule" -inf: not_defined pingd or pingd lte 0
colocation ip-with-slony inf: slony-fail vir-ip
colocation ip-with-slony2 inf: slony-fail2 vir-ip
order ip-b4-slony2 inf: vir-ip slony-fail2
order slony-b4-ip inf: vir-ip slony-fail
property $id="cib-bootstrap-options" \
        dc-version="1.0.5-3840e6b5a305ccb803d29b468556739e75532d56" \
        cluster-infrastructure="Heartbeat" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        last-lrm-refresh="1266851027"
rsc_defaults $id="rsc-options" \
        resource-stickiness="INFINITY"
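One more thing I intend to check with the configuration above: whether the pingd attribute is actually being written into the status section at all, since the vir-ip-with-pingd rule fires on not_defined as well as on lte 0. Nothing version specific as far as I know, just grepping the live CIB:

r...@node1:~# cibadmin -Q -o status | grep pingd

If nothing shows up for either node, that alone would explain the "cannot run anywhere" warnings, independent of the string-versus-number comparison.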
ha.cf (attachment: binary data)