Hi,
I'd like to hear from anyone who has experienced something similar, or who
can suggest a better way to do this.
I need to sync a few files from time to time. There is no need for shared
storage or a DRBD solution. The cluster is active/passive.
I created an OCF master/slave resource called "syncer" from the template,
adding a couple of rsync commands to the RA_syncer::monitor() method, as in
the following example:
...
"/usr/bin/rsync -avz --delete ${SOURCE_CFG_FOLDER_X}/
${OTHER_NODE}::ALIAS_X"
...
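Since the question is whether a timeout is involved, here is a minimal sketch of how one of these calls could be bounded in time so that monitor cannot block on an unreachable peer. It is not what the RA currently does. The helper name sync_folder, the RSYNC_BIN/SYNC_* variables and the timeout values are assumptions made up for illustration. --contimeout only applies to rsync daemon connections (the "::" syntax used above).

```sh
#!/bin/sh
# Hypothetical helper, not part of the real RA: names and values are assumptions.
RSYNC_BIN="${RSYNC_BIN:-/usr/bin/rsync}"
SYNC_CONTIMEOUT="${SYNC_CONTIMEOUT:-10}"   # seconds to wait for the daemon connection
SYNC_IO_TIMEOUT="${SYNC_IO_TIMEOUT:-20}"   # seconds of I/O inactivity before rsync aborts

# sync_folder SRC DEST
# Runs one time-bounded rsync. A failure is logged to stderr but not returned,
# so an unreachable peer does not by itself make the calling action fail.
sync_folder() {
    src="$1"
    dest="$2"
    rc=0
    "$RSYNC_BIN" -avz --delete \
        --contimeout="$SYNC_CONTIMEOUT" --timeout="$SYNC_IO_TIMEOUT" \
        "${src}/" "${dest}" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "sync of ${src} to ${dest} failed (rsync exit ${rc}), retrying at next monitor" >&2
    fi
    return 0
}
```

Inside monitor this would be called as sync_folder "${SOURCE_CFG_FOLDER_X}" "${OTHER_NODE}::ALIAS_X". Whether a failed sync should be swallowed like this or reported to the cluster is a design choice. The sketch swallows it so that the next monitor run simply retries.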
ocf-tester reports that the RA is OK.
I then added a constraint to Pacemaker so that resource_ms_syncer is
Master only if the other resources are master on the same node.
I have performed several tests, and the sync works fine from either node A
or node B when I swap the roles to validate the solution.
But in one case something goes wrong: while A is master, if I shut down
node B, after a while the RA::monitor timer on node A seems to stop
working. Of course I expect to get some OCF log errors such as
"(resource_syncer:0:monitor:stderr) rsync: failed to connect to NODE_B"
while B is down, but when I restart NODE_B I would like the sync to start
working again. To make that happen I currently have to restart the HA
services on NODE_A, which is not acceptable :(
At the foot of this email I have attached the CIB section for the OCF
resource I created. Could the problem be related to some timeout properties
I failed to set? Any suggestions?
Thanks a lot
G.
----
COMPONENTS:
OS: RHEL6.2 2.6.32-220.el6.x86_64
Pacemaker: pacemaker-1.1.6-3.el6.x86_64 (bundled with the OS)
Corosync: corosync-1.4.1-4.el6.x86_64 (bundled with the OS)
Rsync: rsync-3.0.6-5.el6_0.1.x86_64
RESOURCE:
<master id="resource_ms_syncer">
<meta_attributes id="resource_ms_syncer-meta_attributes">
<nvpair id="resource_ms_syncer-meta_attributes-master-max"
name="master-max" value="1"/>
<nvpair id="resource_ms_syncer-meta_attributes-master-node-max"
name="master-node-max" value="1"/>
<nvpair id="resource_ms_syncer-meta_attributes-clone-max"
name="clone-max" value="2"/>
<nvpair id="resource_ms_syncer-meta_attributes-clone-node-max"
name="clone-node-max" value="1"/>
<nvpair id="resource_ms_syncer-meta_attributes-notify"
name="notify" value="true"/>
<nvpair id="resource_ms_syncer-meta_attributes-target-role"
name="target-role" value="Started"/>
</meta_attributes>
<primitive class="ocf" id="resource_syncer" provider="resi"
type="syncer">
<instance_attributes id="resource_syncer-instance_attributes">
<nvpair id="resource_syncer-instance_attributes-state"
name="state" value="/var/run/resource_syncer.state"/>
<nvpair
id="resource_syncer-instance_attributes-internal_parameter"
name="internal_parameter" value="idle"/>
</instance_attributes>
<operations>
<op id="resource_syncer-startup_M" interval="30s" name="monitor"
role="Master"/>
<op enabled="false" id="resource_syncer-startup_S"
interval="40s" name="monitor" on-fail="restart" requires="nothing"
role="Slave" timeout="60s"/>
<op id="resource_syncer-start-0" interval="0" name="start"
timeout="80s"/>
<op id="resource_syncer-stop-0" interval="0" name="stop"
timeout="80s"/>
</operations>
</primitive>
</master>