Hi,

On Fri, Aug 03, 2012 at 04:37:55PM +0200, Tobias Brunner wrote:
> Hi list,
>
> Thanks for the input so far, here are new findings.
>
> > > meta master-max="1" master-node-max="1" clone-max="2"
> > > clone-node-max="1" notify="true" target-role="Master">
> > > location location-groupMysql-on-node1 groupMysql inf: halab3
> >
> > So you have a "mandatory" location constraint saying
> > run this thing only on halab3
> >
>
> You're right, that's not what I want.
>
> > Remove the inf: halab3, or replace it with some not infinite score.
>
> Ok, that's done! Now here is a "crm configure show" from another cluster on
> which "crm resource move groupApache nodeha2" doesn't work (same configuration
> as halab3):
>
> node nodeha1
> node nodeha2
> primitive resApache ocf:heartbeat:apache \
>     params configfile="/etc/apache2/apache2.conf" statusurl="http://localhost/server-status" \
>     op monitor interval="1min" \
>     op start interval="0" timeout="40" \
>     op stop interval="0" timeout="60"
> primitive resDRBDApache ocf:linbit:drbd \
>     params drbd_resource="www-data" \
>     op start interval="0" timeout="240" \
>     op stop interval="0" timeout="100"
> primitive resDRBDPostgresql ocf:linbit:drbd \
>     params drbd_resource="postgresql" \
>     op start interval="0" timeout="240" \
>     op stop interval="0" timeout="100"
> primitive resFsApache ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/www-data" directory="/home/www-data" fstype="ext4" \
>     op start interval="0" timeout="60" \
>     op stop interval="0" timeout="60"
> primitive resFsPostgresql ocf:heartbeat:Filesystem \
>     params device="/dev/drbd/by-res/postgresql" directory="/var/lib/postgresql" fstype="ext4" \
>     op start interval="0" timeout="60" \
>     op stop interval="0" timeout="60"
> primitive resIPApache ocf:heartbeat:IPaddr2 \
>     params ip="178.209.1.10" nic="eth0" cidr_netmask="28" \
>     op monitor interval="30s"
> primitive resIPPostgresql ocf:heartbeat:IPaddr2 \
>     params ip="178.209.1.11" nic="eth0" cidr_netmask="28" \
>     op monitor interval="30s"
> primitive resPostgresql ocf:heartbeat:pgsql \
>     params pgctl="/usr/lib/postgresql/8.4/bin/pg_ctl" psql="/usr/lib/postgresql/8.4/bin/psql" pgdata="/var/lib/postgresql/8.4/main" pghost="178.209.1.11" config="/etc/postgresql/8.4/main/postgresql.conf" logfile="/var/log/postgresql/postgresql-8.4-main.log" pgdb="template1" monitor_user="monitor" monitor_password="123" \
>     op monitor interval="30" timeout="30" depth="0" \
>     op start interval="0" timeout="120" \
>     op stop interval="0" timeout="120"
> group groupApache resFsApache resIPApache resApache
> group groupPostgresql resFsPostgresql resIPPostgresql resPostgresql
> ms msResDRBDApache resDRBDApache \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
> ms msResDRBDPostgresql resDRBDPostgresql \
>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
> location location-groupApache-on-node1 groupApache 50: nodeha1
> location location-groupPostgresql-on-node1 groupPostgresql 50: nodeha1
> colocation colo-groupApache-msResDRBDApache inf: groupApache msResDRBDApache:Master
> colocation colo-groupPostgresql-msResDRBDPostgresql inf: groupPostgresql msResDRBDPostgresql:Master
> order orderGroupApache-after-msResDRBDApache inf: msResDRBDApache:promote groupApache:start
> order orderGroupPostgresql-after-msResDRBDPostgresql inf: msResDRBDPostgresql:promote groupPostgresql:start
> property $id="cib-bootstrap-options" \
>     dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     no-quorum-policy="ignore" \
>     stonith-enabled="false" \
>     last-lrm-refresh="1343987736"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"
>
>
> Before "crm resource move groupApache nodeha2":
> ./showscores.sh
>
> Resource                      Score      Node         Stickiness  #Fail  Migration-Threshold
> resApache                     100        clientisha1  100         0
> resApache                     -INFINITY  clientisha2  100         0
> resDRBDApache:0               0          clientisha2  100         0
> resDRBDApache:0               10100      clientisha1  100         0
> resDRBDApache:0_(master)      10700      clientisha1  100         0
> resDRBDApache:1               100        clientisha2  100         0
> resDRBDApache:1               -INFINITY  clientisha1  100         0
> resDRBDApache:1_(master)      -1         clientisha2  100         0
> resDRBDPostgresql:0           0          clientisha2  100         0
> resDRBDPostgresql:0           10100      clientisha1  100         0
> resDRBDPostgresql:0_(master)  10700      clientisha1  100         0
> resDRBDPostgresql:1           100        clientisha2  100         0
> resDRBDPostgresql:1           -INFINITY  clientisha1  100         0
> resDRBDPostgresql:1_(master)  -1         clientisha2  100         0
> resFsApache                   10450      clientisha1  100         0
> resFsApache                   -INFINITY  clientisha2  100         0
> resFsPostgresql               10450      clientisha1  100         0
> resFsPostgresql               -INFINITY  clientisha2  100         0
> resIPApache                   200        clientisha1  100         0
> resIPApache                   -INFINITY  clientisha2  100         0
> resIPPostgresql               200        clientisha1  100         0
> resIPPostgresql               -INFINITY  clientisha2  100         0
> resPostgresql                 100        clientisha1  100         0
> resPostgresql                 -INFINITY  clientisha2  100         0

abs(-inf) > inf

I guess that you need to do some resource cleanup to remove
the record of old failures.

It's interesting that you have three sets of node names (one from
the config, another from showscores, and a third from you). Whoever
got confused.

Thanks,

Dejan

> After "crm resource move groupApache nodeha2":
>
> The constraint is added:
> location cli-prefer-groupApache groupApache \
>     rule $id="cli-prefer-rule-groupApache" inf: #uname eq nodeha2
>
> ./showscores.sh
> Resource                      Score      Node         Stickiness  #Fail  Migration-Threshold
> resApache                     100        clientisha1  100         0
> resApache                     -INFINITY  clientisha2  100         0
> resDRBDApache:0               0          clientisha2  100         0
> resDRBDApache:0               10100      clientisha1  100         0
> resDRBDApache:0_(master)      10700      clientisha1  100         0
> resDRBDApache:1               100        clientisha2  100         0
> resDRBDApache:1               -INFINITY  clientisha1  100         0
> resDRBDApache:1_(master)      -1         clientisha2  100         0
> resDRBDPostgresql:0           0          clientisha2  100         0
> resDRBDPostgresql:0           10100      clientisha1  100         0
> resDRBDPostgresql:0_(master)  10700      clientisha1  100         0
> resDRBDPostgresql:1           100        clientisha2  100         0
> resDRBDPostgresql:1           -INFINITY  clientisha1  100         0
> resDRBDPostgresql:1_(master)  -1         clientisha2  100         0
> resFsApache                   10450      clientisha1  100         0
> resFsApache                   -INFINITY  clientisha2  100         0
> resFsPostgresql               10450      clientisha1  100         0
> resFsPostgresql               -INFINITY  clientisha2  100         0
> resIPApache                   200        clientisha1  100         0
> resIPApache                   -INFINITY  clientisha2  100         0
> resIPPostgresql               200        clientisha1  100         0
> resIPPostgresql               -INFINITY  clientisha2  100         0
> resPostgresql                 100        clientisha1  100         0
> resPostgresql                 -INFINITY  clientisha2  100         0
>
> The scores don't look like they are changing.
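
A note on why an "inf:" preference may leave the table unchanged: the
"abs(-inf) > inf" remark matches Pacemaker's documented score arithmetic,
where INFINITY is 1000000 and -INFINITY wins whenever the two meet. The
sketch below is only an illustration of that rule in Python, not the policy
engine's code; the names add_scores and fmt are made up for this example,
and the idea that the cli-prefer rule is simply added to the existing
-INFINITY score is an assumption about this particular cluster.

```python
# Sketch of Pacemaker-style score addition. Assumptions: INFINITY is 1000000
# and -INFINITY dominates +INFINITY, as the Pacemaker documentation states.
INFINITY = 1_000_000


def add_scores(a: int, b: int) -> int:
    """Combine two placement scores (hypothetical helper)."""
    # -INFINITY beats everything, including +INFINITY.
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    # Otherwise +INFINITY absorbs any finite score.
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Finite sums are clamped into [-INFINITY, INFINITY].
    return max(-INFINITY, min(INFINITY, a + b))


def fmt(score: int) -> str:
    """Render a score the way showscores-style tables print it."""
    if score >= INFINITY:
        return "INFINITY"
    if score <= -INFINITY:
        return "-INFINITY"
    return str(score)


# resApache is at -INFINITY on the target node before the move;
# the move adds an INFINITY preference for that node.
before_move = -INFINITY
cli_prefer = INFINITY
print(fmt(add_scores(before_move, cli_prefer)))  # -INFINITY
```

Under that rule the new constraint cannot lift a node that is already at
-INFINITY, so whatever contributes the -INFINITY has to go first. The thread
does not establish what that is here; clearing old failure records with
something like "crm resource cleanup groupApache" is the suggestion above
and is one thing to try.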
>
> The log looks like that:
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib_process_request: Operation complete: op cib_delete for section constraints (origin=nodeha1/crm_resource/3, version=0.69.2): ok (rc=0)
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: - <cib admin_epoch="0" epoch="69" num_updates="2" />
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + <cib epoch="70" num_updates="1" admin_epoch="0" validate-with="pacemaker-1.2" crm_feature_set="3.0.6" update-origin="nodeha1" update-client="crm_resource" cib-last-written="Fri Aug 3 16:31:40 2012" have-quorum="1" dc-uuid="nodeha2" >
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + <configuration >
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + <constraints >
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + <rsc_location id="cli-prefer-groupApache" rsc="groupApache" __crm_diff_marker__="added:top" >
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + <rule id="cli-prefer-rule-groupApache" score="INFINITY" boolean-op="and" >
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + <expression id="cli-prefer-expr-groupApache" attribute="#uname" operation="eq" value="nodeha2" type="string" />
> Aug 03 16:33:11 nodeha2 crmd: [4173]: info: abort_transition_graph: te_update_diff:126 - Triggered transition abort (complete=1, tag=diff, id=(null), magic=NA, cib=0.70.1) : Non-status change
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + </rule>
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + </rsc_location>
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + </constraints>
> Aug 03 16:33:11 nodeha2 crmd: [4173]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + </configuration>
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib:diff: + </cib>
> Aug 03 16:33:11 nodeha2 cib: [4168]: info: cib_process_request: Operation complete: op cib_modify for section constraints (origin=nodeha1/crm_resource/4, version=0.70.1): ok (rc=0)
> Aug 03 16:33:11 nodeha2 pengine: [4172]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Aug 03 16:33:11 nodeha2 pengine: [4172]: notice: unpack_rsc_op: Operation monitor found resource resDRBDPostgresql:0 active in master mode on nodeha1
> Aug 03 16:33:11 nodeha2 pengine: [4172]: notice: unpack_rsc_op: Operation monitor found resource resDRBDApache:0 active in master mode on nodeha1
> Aug 03 16:33:11 nodeha2 crmd: [4173]: notice: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> Aug 03 16:33:11 nodeha2 crmd: [4173]: info: do_te_invoke: Processing graph 332 (ref=pe_calc-dc-1344004391-579) derived from /var/lib/pengine/pe-input-87.bz2
> Aug 03 16:33:11 nodeha2 crmd: [4173]: notice: run_graph: ==== Transition 332 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-87.bz2): Complete
> Aug 03 16:33:11 nodeha2 crmd: [4173]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Aug 03 16:33:11 nodeha2 pengine: [4172]: notice: process_pe_message: Transition 332: PEngine Input stored in: /var/lib/pengine/pe-input-87.bz2
>
> Maybe I need to clear some counters or score caches?
>
> > > How can I debug such problems?
> >
> > Experience helps ;-)
>
> That's really true. And I'm actually in the process of gaining experience =)
>
> Cheers,
> Tobias
>
> --
> Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
> Support +41 44 637 40 40 | Tel +41 44 637 40 00 | Direct +41 44 637 40 13
> Skype nine.ch_support
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
