Dominik,
Here is the status of the two concerns I needed help on.
01) When a node comes back up after a restart of heartbeat, resources get
bounced when it rejoins the cluster.
STATUS: The resources still get bounced when a node rejoins the cluster, even
after I deleted all the constraints.
02) Stopping one resource in a group does not fail over the group to the other
node.
STATUS: migration-threshold works like a charm. :) Thanks.
If I may, I have another concern that popped up.
03) I cannot seem to get MailTo to work. I am trying to add this resource to
the Directory_Server group so that every time a failover occurs, it notifies
me.
Below is the current cib.xml file I have.
<cib admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0"
have-quorum="1" dc-uuid="27f54ec3-b626-4b4f-b8a6-4ed0b768513c" epoch="99"
num_updates="0" cib-last-written="Tue Jan 27 12:59:21 2009">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
value="1.0.1-node: 6fc5ce8302abf145a02891ec41e5a492efbe8efe"/>
</cluster_property_set>
</crm_config>
<nodes>
<node id="5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e" uname="nomen.esri.com"
type="normal">
<instance_attributes id="nodes-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e">
<nvpair id="standby-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e"
name="standby" value="off"/>
</instance_attributes>
</node>
<node id="27f54ec3-b626-4b4f-b8a6-4ed0b768513c" uname="rubric.esri.com"
type="normal">
<instance_attributes id="nodes-27f54ec3-b626-4b4f-b8a6-4ed0b768513c">
<nvpair id="standby-27f54ec3-b626-4b4f-b8a6-4ed0b768513c"
name="standby" value="off"/>
</instance_attributes>
</node>
</nodes>
<resources>
<group id="Directory_Server">
<meta_attributes id="Directory_Server-meta_attributes">
<nvpair id="Directory_Server-meta_attributes-collocated"
name="collocated" value="true"/>
<nvpair id="Directory_Server-meta_attributes-ordered" name="ordered"
value="true"/>
<nvpair id="Directory_Server-meta_attributes-migration-threshold"
name="migration-threshold" value="1"/>
<nvpair id="Directory_Server-meta_attributes-failure-timeout"
name="failure-timeout" value="10s"/>
</meta_attributes>
<primitive class="ocf" id="VIP" provider="heartbeat" type="IPaddr">
<instance_attributes id="VIP-instance_attributes">
<nvpair id="VIP-instance_attributes-ip" name="ip"
value="10.50.26.250"/>
</instance_attributes>
<operations id="VIP-ops">
<op id="VIP-monitor-5s" interval="5s" name="monitor" timeout="5s"/>
</operations>
</primitive>
<primitive class="ocf" id="ECAS" provider="esri" type="ecas">
<operations id="ECAS-ops">
<op id="ECAS-monitor-3s" interval="3s" name="monitor" timeout="3s"/>
</operations>
</primitive>
<primitive class="ocf" id="FDS_Admin" provider="esri" type="fdsadm">
<operations id="FDS_Admin-ops">
<op id="FDS_Admin-monitor-3s" interval="3s" name="monitor"
timeout="3s"/>
</operations>
</primitive>
<primitive class="ocf" provider="heartbeat" type="MailTo"
id="Emergency_Contact">
<instance_attributes id="Emergency_Contact-instance_attributes">
<nvpair id="Emergency_Contact-instance_attributes-email"
name="email" value="[email protected]"/>
<nvpair id="Emergency_Contact-instance_attributes-subject"
name="subject" value="Failover Occurred"/>
</instance_attributes>
<operations id="Emergency_Contact-ops">
<op interval="3s" name="monitor" timeout="3s"
id="Emergency_Contact-monitor-3s"/>
</operations>
</primitive>
</group>
</resources>
<constraints/>
</configuration>
</cib>
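For what it is worth, the MailTo agent can also be exercised by hand, outside
the cluster, to rule out a mail-delivery problem as opposed to a cluster
configuration problem. This is only a sketch: it assumes the standard OCF
agent location (/usr/lib/ocf) and a working local MTA, and the OCF_RESKEY_*
variables mirror the nvpairs in the Emergency_Contact primitive above.

```shell
# Invoke the MailTo resource agent directly, the way the cluster would.
# OCF_RESKEY_email / OCF_RESKEY_subject correspond to the "email" and
# "subject" instance attributes in the cib.xml above.
export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_email="[email protected]"
export OCF_RESKEY_subject="Failover test"

# "start" should send a test mail; "monitor" reports the agent's status.
/usr/lib/ocf/resource.d/heartbeat/MailTo start
/usr/lib/ocf/resource.d/heartbeat/MailTo monitor; echo "monitor rc=$?"
```

If no mail arrives from a manual start, the problem is with local mail
delivery rather than with the Pacemaker configuration.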
Help.
Regards,
jerome
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Dominik Klein
Sent: Monday, January 26, 2009 10:52 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Failover not working as I expected
Jerome Yanga wrote:
> Andrew,
>
> I apologize for my sending my previous email abruptly.
>
> I have followed your recommendation and installed Pacemaker.
>
> Here is my config.
>
> Packages Installed:
> heartbeat-2.99.2-6.1
> heartbeat-common-2.99.2-6.1
> heartbeat-debug-2.99.2-6.1
> heartbeat-ldirectord-2.99.2-6.1
> heartbeat-resources-2.99.2-6.1
> libheartbeat2-2.99.2-6.1
> libpacemaker3-1.0.1-3.1
> pacemaker-1.0.1-3.1
> pacemaker-debug-1.0.1-3.1
> pacemaker-pygui-1.4-11.9
> pacemaker-pygui-debug-1.4-11.9
>
>
>
> ha.cf:
> # Logging
> debug 1
> use_logd false
> logfacility daemon
>
> # Misc Options
> traditional_compression off
> compression bz2
> coredumps true
>
> # Communications
> udpport 691
> bcast eth1 eth0
> autojoin any
>
> # Thresholds (in seconds)
> keepalive 1
> warntime 6
> deadtime 10
> initdead 15
>
> ping 10.50.254.254
> crm respawn
> apiauth mgmtd uid=root
> respawn root /usr/lib/heartbeat/mgmtd -v
>
>
> cib.xml:
> <cib admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0"
> have-quorum="1" epoch="57" dc-uuid="5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e"
> num_updates="0" cib-last-written="Mon Jan 26 13:57:32 2009">
> <configuration>
> <crm_config>
> <cluster_property_set id="cib-bootstrap-options">
> <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
> value="1.0.1-node: 6fc5ce8302abf145a02891ec41e5a492efbe8efe"/>
> </cluster_property_set>
> </crm_config>
> <nodes>
> <node id="5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e" uname="nomen.esri.com"
> type="normal">
> <instance_attributes id="nodes-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e">
> <nvpair id="standby-5e3e3c2d-55e7-4c51-90be-5c4a1912bf3e"
> name="standby" value="off"/>
> </instance_attributes>
> </node>
> <node id="27f54ec3-b626-4b4f-b8a6-4ed0b768513c" uname="rubric.esri.com"
> type="normal">
> <instance_attributes id="nodes-27f54ec3-b626-4b4f-b8a6-4ed0b768513c">
> <nvpair id="standby-27f54ec3-b626-4b4f-b8a6-4ed0b768513c"
> name="standby" value="off"/>
> </instance_attributes>
> </node>
> </nodes>
> <resources>
> <group id="Directory_Server">
> <meta_attributes id="Directory_Server-meta_attributes">
> <nvpair id="Directory_Server-meta_attributes-collocated"
> name="collocated" value="true"/>
> <nvpair id="Directory_Server-meta_attributes-ordered"
> name="ordered" value="true"/>
> <nvpair id="Directory_Server-meta_attributes-resource_stickiness"
> name="resource_stickiness" value="100"/>
> </meta_attributes>
> <primitive class="ocf" id="VIP" provider="heartbeat" type="IPaddr">
> <instance_attributes id="VIP-instance_attributes">
> <nvpair id="VIP-instance_attributes-ip" name="ip"
> value="10.50.26.250"/>
> </instance_attributes>
> <operations id="VIP-ops">
> <op id="VIP-monitor-5s" interval="5s" name="monitor"
> timeout="5s"/>
> </operations>
> </primitive>
> <primitive class="ocf" id="ECAS" provider="esri" type="ecas">
> <operations id="ECAS-ops">
> <op id="ECAS-monitor-3s" interval="3s" name="monitor"
> timeout="3s"/>
> </operations>
> <meta_attributes id="ECAS-meta_attributes">
> <nvpair id="ECAS-meta_attributes-target-role" name="target-role"
> value="Started"/>
> </meta_attributes>
> </primitive>
> <primitive class="ocf" id="FDS_Admin" provider="esri" type="fdsadm">
> <operations id="FDS_Admin-ops">
> <op id="FDS_Admin-monitor-3s" interval="3s" name="monitor"
> timeout="3s"/>
> </operations>
> </primitive>
> </group>
> </resources>
> <constraints>
> <rsc_location id="cli-prefer-Directory_Server" rsc="Directory_Server">
> <rule id="cli-prefer-rule-Directory_Server" score="INFINITY"
> boolean-op="and">
> <expression id="cli-prefer-expr-Directory_Server"
> attribute="#uname" operation="eq" value="rubric.esri.com" type="string"/>
> </rule>
> </rsc_location>
> <rsc_location id="cli-prefer-FDS_Admin" rsc="FDS_Admin">
> <rule id="cli-prefer-rule-FDS_Admin" score="INFINITY"
> boolean-op="and">
> <expression id="cli-prefer-expr-FDS_Admin" attribute="#uname"
> operation="eq" value="nomen.esri.com" type="string"/>
> </rule>
> </rsc_location>
> </constraints>
> </configuration>
> </cib>
>
>
>
> I still have the following issues when I only had heartbeat 2.1.3-1. My
> concerns are still as follows:
>
> 01) When a node comes back up after a restart of heartbeat, resources gets
> bounced when it rejoins the cluster.
Well, you have defined rsc_location constraints with a score of
INFINITY, so that is expected.
> 02) Stopping one resource in a group does not failover the group to the
> other node.
Look up migration-threshold.
Regards
Dominik
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems