I have three machines named anlutest1, anlutest2, and anlutest3 on which I'm trying to get IP failover working. I'm using heartbeat for the messaging layer, and everything works great when a machine goes down. But I would also like to fail over an IP when EITHER the eth0 or the eth1 network interface fails. From reading
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch09s03s03.html
it seems the right way to do this is to add a ping resource.

Here is my XML configuration: http://pastebin.com/05z7eB2s

This config doesn't work for me. Using the showscores.sh script found at
http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg00410.html
I see that my scores are:

Resource   Score     Node       Stickiness  #Fail  Migration-Threshold
address01  0         anlutest3  0           0
address01  1006      anlutest1  0           5
address01  50        anlutest2  0           157
address02  0         anlutest3  0           0
address02  1050      anlutest2  0           2
address02  6         anlutest1  0           0
address03  1000      anlutest3  0           7
address03  50        anlutest2  0
address03  6         anlutest1  0           0
ping:0     0         anlutest1  0           6
ping:0     0         anlutest2  0           14
ping:0     0         anlutest3  0           0
ping:1     0         anlutest2  0
ping:1     0         anlutest3  0           28
ping:1     -1000000  anlutest1  0           0
ping:2     0         anlutest3  0           13
ping:2     -1000000  anlutest1  0           0
ping:2     -1000000  anlutest2  0

which make no sense to me at all. I don't see how I could be getting these scores of 50 and 1006. When I take down an interface on anlutest3, I see scores of 4 and 1004, which sort of make sense, except that the multiplier of 100 isn't being applied. I was experimenting with changing values, so maybe it's caching old values. If so, how do I force it to pick up the new values? Furthermore, should there be any scores of 0 at all? If all 6 IPs I am pinging return successfully, shouldn't my scores be either 600 or 1600?

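To spell out the arithmetic behind the 600 and 1600 I am expecting, here is a rough Python sketch. It assumes that the ping agent publishes (number of reachable hosts) x (multiplier) as the pingd attribute, and that my location rules add that value on top of a preference of 1000 for the node an address should normally run on. The function names are made up for illustration; they are not part of Pacemaker.

```python
def expected_pingd(reachable_hosts: int, multiplier: int) -> int:
    """Assumed pingd attribute value: reachable hosts times the multiplier."""
    return reachable_hosts * multiplier


def expected_score(reachable_hosts: int, multiplier: int, preference: int = 0) -> int:
    """Assumed location score: node preference plus the pingd value."""
    return preference + expected_pingd(reachable_hosts, multiplier)


if __name__ == "__main__":
    # Compare a node without a preference against the preferred node (1000).
    for reachable in (6, 5, 4):
        plain = expected_score(reachable, multiplier=100)
        preferred = expected_score(reachable, multiplier=100, preference=1000)
        print(f"{reachable} hosts reachable: {plain} or {preferred}")
```

Under the same assumptions, a multiplier of 1 instead of 100 would give 6 and 1006 with all hosts reachable, and 4 and 1004 with two hosts unreachable, which are values I am actually seeing.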
In my syslog I also see a ton of messages like

Feb 17 03:54:47 anlutest2 lrmd: [1137]: info: perform_op:2877: operations on resource address01 already delayed
Feb 17 03:54:48 anlutest2 lrmd: [1137]: info: perform_op:2873: operation monitor[419] on ocf::ping::ping:1 for client 1140, its parameters: CRM_meta_clone=[1] host_list=[10.54.130.6 10.54.130.8 10.54.130.7 50.97.196.101 50.97.196.103 50.9CRM_meta_clone_max=[3] dampen=[60s] crm_feature_set=[3.0.1] CRM_meta_globally_unique=[false] multiplier=[10000] CRM_meta_name=[monitor] CRM_meta_timeout=[60000] CRM_meta_interval=[5000] for rsc is already running.
Feb 17 03:54:48 anlutest2 lrmd: [1137]: info: perform_op:2883: postponing all ops on resource ping:1 by 1000 ms
Feb 17 03:54:48 anlutest2 lrmd: [1137]: info: perform_op:2873: operation monitor[171] on ocf::ping::ping:2 for client 1140, its parameters: CRM_meta_clone=[2] host_list=[10.54.130.6 10.54.130.8 10.54.130.7 50.97.196.101 50.97.196.103 50.9CRM_meta_clone_max=[3] dampen=[60s] crm_feature_set=[3.0.1] CRM_meta_globally_unique=[false] multiplier=[1] CRM_meta_name=[monitor] CRM_meta_timeout=[30000] CRM_meta_interval=[5000] for rsc is already running.
Feb 17 03:54:48 anlutest2 lrmd: [1137]: info: perform_op:2883: postponing all ops on resource ping:2 by 1000 ms

and occasionally

Feb 17 03:54:33 anlutest2 attrd: [1139]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd (4000)
Feb 17 03:54:33 anlutest2 attrd: [1139]: info: attrd_ha_callback: flush message from anlutest2
Feb 17 03:54:33 anlutest2 attrd: [1139]: WARN: find_nvpair_attr: Multiple attributes match name=pingd
Feb 17 03:54:33 anlutest2 attrd: [1139]: info: find_nvpair_attr: Value: 50 #011(id=status-d619a94e-ebba-4ed0-8e0f-89837dd7506b-pingd)
Feb 17 03:54:33 anlutest2 attrd: [1139]: info: find_nvpair_attr: Value: 3 #011(id=status-ab3c1a25-9471-48f7-9c0b-c76238abd402-pingd)
Feb 17 03:54:33 anlutest2 attrd: [1139]: info: attrd_perform_update: Sent update -40: pingd=4000
Feb 17 03:54:33 anlutest2 attrd: [1139]: ERROR: attrd_cib_callback: Update -40 for pingd=4000 failed: Required data for this CIB API call not found

Could someone just take a look at my config and let me know what I'm doing wrong? Or if there's a better way to do what I want to do...

Thanks in advance,
Anlu

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org