Hi All,

I am trying to configure Pacemaker (1.0.10) to make a single filesystem highly available across two nodes (please don't be distracted by the dangers of multiply-mounted filesystems, cluster filesystems, etc., as I am absolutely clear about that -- consider the filesystem resource as just an example if you wish). Here is my configuration:
node foo1
node foo2 \
        attributes standby="off"
primitive OST1 ocf:heartbeat:Filesystem \
        meta target-role="Started" \
        operations $id="BAR1-operations" \
        op monitor interval="120" timeout="60" \
        op start interval="0" timeout="300" \
        op stop interval="0" timeout="300" \
        params device="/dev/disk/by-uuid/8c500092-5de6-43d7-b59a-ef91fa9667b9" directory="/mnt/bar1" fstype="ext3"
primitive st-pm stonith:external/powerman \
        params serverhost="192.168.122.1:10101" poweroff="0"
clone fencing st-pm
property $id="cib-bootstrap-options" \
        dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="1" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1306783242" \
        default-resource-stickiness="1000"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

The two problems I have run into are:

1. Preventing the resource from failing back to the node it was previously on, once it has failed over and the previous node has been restored. This is basically what is documented at http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch05s03s02.html

2. Preventing the active node from being STONITHed when the resource is moved back to its failed-and-restored node after a failover. In other words: BAR1 is available on foo1; foo1 fails and the resource is moved to foo2; foo1 then returns and the resource is failed back to foo1, but in doing that foo2 is STONITHed.

For #1, as you can see, I tried setting the default resource stickiness to 100. That didn't seem to work: when I stopped corosync on the active node, the service failed over, but it promptly failed back when I started corosync again, contrary to the example at the URL above. I subsequently tried (or at least I think I did) adding a specific resource stickiness of 1000, but that didn't seem to help either.

As for #2, the problem with STONITHing foo2 when failing back to foo1 is that foo1 and foo2 are an active/active pair of servers, so STONITHing foo2 just to restore foo1's services takes foo2's services out of service too. I do, however, still want a node that is believed to be dead to be STONITHed before its resource(s) are failed over.

Any hints on what I am doing wrong?
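In case it helps with #1, here is how I was planning to check the allocation scores (assuming I have the tooling right: ptest ships with pacemaker 1.0, -s prints the scores, and -L reads the live CIB). If the failback is score-driven, OST1's score on foo1 should come out higher than its score on foo2 plus the stickiness:

# dump the allocation scores pacemaker computed from the live CIB
ptest -sL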
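Also for #1: one thing I have not tried yet (assuming the 1.0 crm shell supports it, which I believe it does) is setting the stickiness directly in OST1's meta attributes. A resource's own meta attribute should take precedence over rsc_defaults and over the deprecated default-resource-stickiness property, which would also sidestep the fact that I currently have two different values configured (1000 in the property, 100 in rsc_defaults):

# pin the stickiness on the resource itself; this overrides both
# rsc_defaults resource-stickiness and the deprecated
# default-resource-stickiness cluster property
crm resource meta OST1 set resource-stickiness 1000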
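On #2, my unverified theory is that foo2 gets fenced because the stop of OST1 on foo2 fails or times out during the failback, and with STONITH enabled a failed stop escalates to fencing by default. If that reading is right, one way to confirm it (a diagnostic, not a fix, since it leaves the resource blocked and needing manual cleanup after a stop failure) would be to add on-fail to the stop operation:

# same primitive as above, with on-fail added to the stop op so that a
# failed stop blocks the resource instead of fencing the node
primitive OST1 ocf:heartbeat:Filesystem \
        meta target-role="Started" \
        operations $id="BAR1-operations" \
        op monitor interval="120" timeout="60" \
        op start interval="0" timeout="300" \
        op stop interval="0" timeout="300" on-fail="block" \
        params device="/dev/disk/by-uuid/8c500092-5de6-43d7-b59a-ef91fa9667b9" directory="/mnt/bar1" fstype="ext3"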
Thanx and cheers,
b.