Yesterday, the last few emails between Vadym and me were inadvertently not posted to this list. Here are those posts for anyone having similar issues.
Regards,
Craig.

On 7 October 2010 15:20, Vadym Chepkov <vchep...@gmail.com> wrote:
> No, the default is 0 - it is not taken into consideration at all.
> The resource stays in place because the allocation on the other host has
> the same score.
> You can see all computed scores using ptest -sL.
>
> You don't need to specify $id=; it's redundant, by the way.
>
> Vadym
>
> On Oct 6, 2010, at 9:59 PM, Craig Hurley wrote:
>
>> Thanks again and I see what you mean; I unplugged eth0 from both nodes
>> and g_cluster_services went down on both nodes. I took your advice
>> on board and read this section:
>> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#id771622
>>
>> ... and I've configured the location rule so that g_cluster_services
>> runs on the node with the most connections:
>>
>> primitive p_ping ocf:pacemaker:ping \
>>         params name="p_ping" host_list="172.20.0.254 172.20.50.1 172.20.50.2" multiplier="1000" \
>>         op monitor interval="20s"
>> clone c_ping p_ping \
>>         meta globally-unique="false"
>> location loc_ping g_cluster_services \
>>         rule $id="loc_ping-rule" p_ping: defined p_ping
>>
>> Now if I unplug eth0 from both nodes, g_cluster_services remains up on
>> one of the nodes, which suits my requirements :)
>>
>> One last item: in my config I have not specified a resource
>> stickiness, and the master role and g_cluster_services move around as
>> expected when a node fails. When a failed node comes back online, the
>> master role and g_cluster_services stay where they are (until the next
>> forced failover) -- which is the behaviour I require. Is there a
>> default stickiness that causes this "correct" behaviour?
>>
>> Regards,
>> Craig.
>>
>>
>> On 7 October 2010 11:54, Vadym Chepkov <vchep...@gmail.com> wrote:
>>> A monitor operation is essential for the ping RA; otherwise it won't
>>> work either.
>>>
>>> As for the multiplier - it's all about the score and resource
>>> stickiness. With the multiplier at 200 and resource stickiness set to
>>> 500, for example, when both hosts can ping up to 2 ping nodes they
>>> will stay where they are, but if one host can ping 3 ping nodes and
>>> the other just 2, this will make the resources relocate to the
>>> better-connected host.
>>>
>>> In the simple example I gave you, if this is the IP of a router for
>>> both nodes and it goes down, this will cause the resource not to fail
>>> over but simply go down. If this is not what you want, you would
>>> probably ping not just the router but both nodes' IPs as well, and
>>> fail over only if you are able to ping nothing but yourself:
>>>
>>> location rg0-connected rg0 \
>>>         rule -inf: not_defined pingd or pingd lte 200
>>>
>>> Vadym
>>>
>>> On Oct 6, 2010, at 5:56 PM, Craig Hurley wrote:
>>>
>>>> Thanks Vadym, this worked. It seems the missing name field was
>>>> causing the problem.
>>>>
>>>> On a related note, why do you have a multiplier of 200?
>>>>
>>>> According to
>>>> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch09s03s03.html,
>>>> the multiplier field is "The number by which to multiply the number of
>>>> connected ping nodes by. Useful when there are multiple ping nodes
>>>> configured."
>>>>
>>>> I don't understand why one would want to multiply the number of
>>>> connected nodes when there are multiple ping nodes :/
>>>>
>>>> Regards,
>>>> Craig.
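
[An aside on the arithmetic, for the archives. As I understand it (so
treat the numbers below as a sketch rather than gospel), the ping RA sets
the node attribute (p_ping in my config above) to the number of reachable
addresses in host_list multiplied by the multiplier. With my three
addresses and multiplier="1000":

    all 3 reachable:   p_ping = 3 * 1000 = 3000
    only 1 reachable:  p_ping = 1 * 1000 = 1000
    none reachable:    p_ping = 0

The score-based rule "p_ping: defined p_ping" then adds that attribute
value to each node's score for the group, so the group gravitates to the
best-connected node, and with a non-zero resource-stickiness it should
only relocate when the difference in p_ping exceeds the stickiness. If
you want the stay-put behaviour to be explicit rather than a side effect
of equal scores, something like the line below should do it; the 500 is
just an example value:

    rsc_defaults resource-stickiness="500"
]
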
>>>> On 7 October 2010 09:37, Vadym Chepkov <vchep...@gmail.com> wrote:
>>>>> This is my config, which works fine:
>>>>>
>>>>> primitive ping ocf:pacemaker:ping \
>>>>>         params name="pingd" host_list="10.10.10.250" multiplier="200" timeout="5" \
>>>>>         op monitor interval="10"
>>>>>
>>>>> clone connected ping \
>>>>>         meta globally-unique="false"
>>>>>
>>>>> location rg0-connected rg0 \
>>>>>         rule -inf: not_defined pingd or pingd lte 0
>>>>>
>>>>>
>>>>> On Oct 6, 2010, at 4:21 PM, Craig Hurley wrote:
>>>>>
>>>>>> I tried using ping instead of pingd and I added "number" to the
>>>>>> evaluation; I get the same results :/
>>>>>>
>>>>>> primitive p_ping ocf:pacemaker:ping params host_list=172.20.0.254
>>>>>> clone c_ping p_ping meta globally-unique=false
>>>>>> location loc_ping g_cluster_services rule -inf: not_defined p_ping or
>>>>>> p_ping number:lte 0
>>>>>>
>>>>>> Regards,
>>>>>> Craig.
>>>>>>
>>>>>>
>>>>>> On 6 October 2010 20:43, Jayakrishnan <jayakrishnan...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I guess this change:
>>>>>>>
>>>>>>> location loc_pingd g_cluster_services rule -inf: not_defined pingd or
>>>>>>> pingd number:lte 0
>>>>>>>
>>>>>>> should work.
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>>
>>>>>>> Jayakrishnan. L
>>>>>>>
>>>>>>> Visit:
>>>>>>> www.foralllinux.blogspot.com
>>>>>>> www.jayakrishnan.bravehost.com
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 6, 2010 at 11:56 AM, Claus Denk <d...@us.es> wrote:
>>>>>>>>
>>>>>>>> I am having a similar problem, so let's wait for the experts. But in
>>>>>>>> the meanwhile, try changing
>>>>>>>>
>>>>>>>> location loc_pingd g_cluster_services rule -inf: not_defined p_pingd
>>>>>>>> or p_pingd lte 0
>>>>>>>>
>>>>>>>> to
>>>>>>>>
>>>>>>>> location loc_pingd g_cluster_services rule -inf: not_defined pingd
>>>>>>>> or pingd number:lte 0
>>>>>>>>
>>>>>>>> and see what happens. As far as I have read, it is also recommended
>>>>>>>> to use the "ping" resource instead of "pingd"...
>>>>>>>>
>>>>>>>> kind regards, Claus
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/06/2010 05:45 AM, Craig Hurley wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I have a 2-node cluster running DRBD, heartbeat and pacemaker in
>>>>>>>>> active/passive mode. On both nodes, eth0 is connected to the main
>>>>>>>>> network and eth1 is used to connect the nodes directly to each
>>>>>>>>> other. The nodes share a virtual IP address on eth0. Pacemaker is
>>>>>>>>> also controlling a custom service with an LSB-compliant script in
>>>>>>>>> /etc/init.d/. All of this is working fine and I'm happy with it.
>>>>>>>>>
>>>>>>>>> I'd like to configure the nodes so that they fail over if eth0 goes
>>>>>>>>> down (or if they cannot access a particular gateway), so I tried
>>>>>>>>> adding the following (as per
>>>>>>>>> http://www.clusterlabs.org/wiki/Example_configurations#Set_up_pingd):
>>>>>>>>>
>>>>>>>>> primitive p_pingd ocf:pacemaker:pingd params host_list=172.20.0.254 op
>>>>>>>>> monitor interval=15s timeout=5s
>>>>>>>>> clone c_pingd p_pingd meta globally-unique=false
>>>>>>>>> location loc_pingd g_cluster_services rule -inf: not_defined p_pingd
>>>>>>>>> or p_pingd lte 0
>>>>>>>>>
>>>>>>>>> ... but when I do add that, all resources are stopped and they don't
>>>>>>>>> come back up on either node. Am I making a basic mistake, or do you
>>>>>>>>> need more info from me?
>>>>>>>>>
>>>>>>>>> All help is appreciated,
>>>>>>>>> Craig.
>>>>>>>>>
>>>>>>>>> pacemaker
>>>>>>>>> Version: 1.0.8+hg15494-2ubuntu2
>>>>>>>>>
>>>>>>>>> heartbeat
>>>>>>>>> Version: 1:3.0.3-1ubuntu1
>>>>>>>>>
>>>>>>>>> drbd8-utils
>>>>>>>>> Version: 2:8.3.7-1ubuntu2.1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> r...@rpalpha:~$ sudo crm configure show
>>>>>>>>> node $id="32482293-7b0f-466e-b405-c64bcfa2747d" rpalpha
>>>>>>>>> node $id="3f2aac12-05aa-4ac7-b91f-c47fa28efb44" rpbravo
>>>>>>>>> primitive p_drbd_data ocf:linbit:drbd \
>>>>>>>>>         params drbd_resource="data" \
>>>>>>>>>         op monitor interval="30s"
>>>>>>>>> primitive p_fs_data ocf:heartbeat:Filesystem \
>>>>>>>>>         params device="/dev/drbd/by-res/data" directory="/mnt/data" fstype="ext4"
>>>>>>>>> primitive p_ip ocf:heartbeat:IPaddr2 \
>>>>>>>>>         params ip="172.20.50.3" cidr_netmask="255.255.0.0" nic="eth0" \
>>>>>>>>>         op monitor interval="30s"
>>>>>>>>> primitive p_rp lsb:rp \
>>>>>>>>>         op monitor interval="30s" \
>>>>>>>>>         meta target-role="Started"
>>>>>>>>> group g_cluster_services p_ip p_fs_data p_rp
>>>>>>>>> ms ms_drbd p_drbd_data \
>>>>>>>>>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>>>>>>>>> location loc_preferred_master g_cluster_services inf: rpalpha
>>>>>>>>> colocation colo_mnt_on_master inf: g_cluster_services ms_drbd:Master
>>>>>>>>> order ord_mount_after_drbd inf: ms_drbd:promote g_cluster_services:start
>>>>>>>>> property $id="cib-bootstrap-options" \
>>>>>>>>>         dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>>>>>>>>>         cluster-infrastructure="Heartbeat" \
>>>>>>>>>         no-quorum-policy="ignore" \
>>>>>>>>>         stonith-enabled="false" \
>>>>>>>>>         expected-quorum-votes="2"
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> r...@rpalpha:~$ sudo cat /etc/ha.d/ha.cf
>>>>>>>>> node rpalpha
>>>>>>>>> node rpbravo
>>>>>>>>>
>>>>>>>>> keepalive 2
>>>>>>>>> warntime 5
>>>>>>>>> deadtime 15
>>>>>>>>> initdead 60
>>>>>>>>>
>>>>>>>>> mcast eth0 239.0.0.43 694 1 0
>>>>>>>>> bcast eth1
>>>>>>>>>
>>>>>>>>> use_logd yes
>>>>>>>>> autojoin none
>>>>>>>>> crm respawn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> r...@rpalpha:~$ sudo cat /etc/drbd.conf
>>>>>>>>> global {
>>>>>>>>>         usage-count no;
>>>>>>>>> }
>>>>>>>>> common {
>>>>>>>>>         protocol C;
>>>>>>>>>
>>>>>>>>>         handlers {}
>>>>>>>>>
>>>>>>>>>         startup {}
>>>>>>>>>
>>>>>>>>>         disk {}
>>>>>>>>>
>>>>>>>>>         net {
>>>>>>>>>                 cram-hmac-alg sha1;
>>>>>>>>>                 shared-secret "foobar";
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>>         syncer {
>>>>>>>>>                 verify-alg sha1;
>>>>>>>>>                 rate 100M;
>>>>>>>>>         }
>>>>>>>>> }
>>>>>>>>> resource data {
>>>>>>>>>         device /dev/drbd0;
>>>>>>>>>         meta-disk internal;
>>>>>>>>>         on rpalpha {
>>>>>>>>>                 disk /dev/mapper/rpalpha-data;
>>>>>>>>>                 address 192.168.1.1:7789;
>>>>>>>>>         }
>>>>>>>>>         on rpbravo {
>>>>>>>>>                 disk /dev/mapper/rpbravo-data;
>>>>>>>>>                 address 192.168.1.2:7789;
>>>>>>>>>         }
>>>>>>>>> }
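
P.S. For anyone who finds this thread later: the root cause of my
original problem was the missing name="..." parameter. If I've read the
agents right, ping/pingd publishes its attribute under the default name
"pingd" when no name is given, so my rule's "not_defined p_pingd" test
was true on every node and scored every node -INFINITY, which is why all
resources stopped. As Vadym mentioned, ptest makes the computed scores
visible; something along these lines (run against the live cluster,
grepping for my group name) is a quick sanity check:

    sudo ptest -sL | grep g_cluster_services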