Hi!

I have node A (armani) and node B (bulgari). I have configured a virtual IP address and Kamailio (SIP proxy) as a cloned resource running on both nodes. If Kamailio fails two times on the node holding the IP address, the IP address is migrated.

The problem happens when Kamailio fails on both nodes at the same time: the clone instances get swapped between node A and B, and after that the IP address is no longer migrated when Kamailio fails. See the attached log for details of the scenario. The config is below.

Is there something wrong with my config that prevents this from being handled properly, or is it a bug? I think the problem is related to the clone instances being swapped while both Kamailio resources are down. Is it possible to stick a cloned resource to a certain node?

Thanks
Klaus

node armani \
        attributes standby="off"
node bulgari \
        attributes standby="off"
primitive failover-ip ocf:heartbeat:IPaddr \
        params ip="83.136.32.161" \
        op monitor interval="3s"
primitive kamailio lsb:kamailio \
        meta migration-threshold="2" failure-timeout="30" \
        op monitor interval="10" timeout="10"
clone cloneKamailio kamailio
colocation colo_ip_with_kamailio inf: failover-ip cloneKamailio
property $id="cib-bootstrap-options" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        cluster-recheck-interval="5s" \
        dc-version="1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3"
rsc_defaults $id="rsc-options" \
        resource-stickiness="5"

=====================================================
Corosync started
-----------------------------------------------------
============
Last updated: Thu Mar 3 11:41:06 2011
Stack: openais
Current DC: armani - partition with quorum
Version: 1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started bulgari
Clone Set: cloneKamailio
    Started: [ bulgari armani ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2
    + (4) start: rc=0 (ok)
    + (5) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2
    + (5) start: rc=0 (ok)
    + (7) monitor: interval=10000ms rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)

=====================================================
armani: /etc/init.d/kamailio stop
Kamailio will be restarted immediately, failure-count increased, IP resides on bulgari
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started bulgari
Clone Set: cloneKamailio
    Started: [ bulgari armani ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=1 last-failure='Thu Mar 3 11:47:36 2011'
    + (6) stop: rc=0 (ok)
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2
    + (5) start: rc=0 (ok)
    + (7) monitor: interval=10000ms rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)

=====================================================
bulgari: /etc/init.d/kamailio stop
Kamailio will be restarted immediately, failure-count increased, IP resides on bulgari
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started bulgari
Clone Set: cloneKamailio
    Started: [ bulgari armani ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=1 last-failure='Thu Mar 3 11:47:36 2011'
    + (6) stop: rc=0 (ok)
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=1 last-failure='Thu Mar 3 11:49:47 2011'
    + (8) stop: rc=0 (ok)
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)

=====================================================
armani: /etc/init.d/kamailio stop
Kamailio failed second time, failure-count increased, will stay stopped for 30 seconds, IP resides on bulgari
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started bulgari
Clone Set: cloneKamailio
    Started: [ bulgari ]
    Stopped: [ kamailio:1 ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=1 last-failure='Thu Mar 3 11:49:47 2011'
    + (8) stop: rc=0 (ok)
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)

Failed actions:
    kamailio:1_monitor_10000 (node=armani, call=8, rc=7, status=complete): not running

=====================================================
while Kamailio is still stopped on armani:
bulgari: /etc/init.d/kamailio stop
Kamailio failed second time on bulgari, failure-count increased, will stay stopped for 30 seconds, IP is stopped
-----------------------------------------------------
Online: [ bulgari armani ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)

Failed actions:
    kamailio:1_monitor_10000 (node=armani, call=8, rc=7, status=complete): not running
    kamailio:0_monitor_10000 (node=bulgari, call=10, rc=7, status=complete): not running

=====================================================
failure-timeout triggers and starts Kamailio on armani,
the cloned resource is "swapped" between the two nodes,
IP is started on armani
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started armani
Clone Set: cloneKamailio
    Started: [ armani ]
    Stopped: [ kamailio:1 ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (10) start: rc=0 (ok)
    + (12) monitor: interval=3000ms rc=0 (ok)
   kamailio:0: migration-threshold=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (11) start: rc=0 (ok)
    + (13) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)

Failed actions:
    kamailio:0_monitor_10000 (node=bulgari, call=10, rc=7, status=complete): not running

=====================================================
failure-timeout triggers and starts Kamailio on bulgari, IP resides on armani
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started armani
Clone Set: cloneKamailio
    Started: [ armani bulgari ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (10) start: rc=0 (ok)
    + (12) monitor: interval=3000ms rc=0 (ok)
   kamailio:0: migration-threshold=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (11) start: rc=0 (ok)
    + (13) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)
   kamailio:1: migration-threshold=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (13) start: rc=0 (ok)
    + (14) monitor: interval=10000ms rc=0 (ok)

=====================================================
now, regardless of how often I stop Kamailio on armani,
armani: /etc/init.d/kamailio stop
failure-count for kamailio:0 is increased, but Kamailio is always
restarted immediately (not waiting for failure-timeout) and the IP
never migrates to bulgari
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started armani
Clone Set: cloneKamailio
    Started: [ armani bulgari ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (10) start: rc=0 (ok)
    + (12) monitor: interval=3000ms rc=0 (ok)
   kamailio:0: migration-threshold=2 fail-count=1 last-failure='Thu Mar 3 11:50:27 2011'
    + (14) stop: rc=0 (ok)
    + (15) start: rc=0 (ok)
    + (16) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)
   kamailio:1: migration-threshold=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (13) start: rc=0 (ok)
    + (14) monitor: interval=10000ms rc=0 (ok)

=====================================================
armani: /etc/init.d/kamailio stop
kamailio:0: fail-count=2, no migration...
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started armani
Clone Set: cloneKamailio
    Started: [ armani bulgari ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (10) start: rc=0 (ok)
    + (12) monitor: interval=3000ms rc=0 (ok)
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (17) stop: rc=0 (ok)
    + (18) start: rc=0 (ok)
    + (19) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)
   kamailio:1: migration-threshold=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (13) start: rc=0 (ok)
    + (14) monitor: interval=10000ms rc=0 (ok)

=====================================================
armani: /etc/init.d/kamailio stop
kamailio:0: fail-count=3, no migration...
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started armani
Clone Set: cloneKamailio
    Started: [ armani bulgari ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (10) start: rc=0 (ok)
    + (12) monitor: interval=3000ms rc=0 (ok)
   kamailio:0: migration-threshold=2 fail-count=3 last-failure='Thu Mar 3 11:50:27 2011'
    + (20) stop: rc=0 (ok)
    + (21) start: rc=0 (ok)
    + (22) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)
   kamailio:1: migration-threshold=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (13) start: rc=0 (ok)
    + (14) monitor: interval=10000ms rc=0 (ok)

=====================================================
armani: /etc/init.d/kamailio stop
kamailio:0: fail-count=4, no migration...
-----------------------------------------------------
Online: [ bulgari armani ]

failover-ip (ocf::heartbeat:IPaddr): Started armani
Clone Set: cloneKamailio
    Started: [ armani bulgari ]

Operations:
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
    + (7) start: rc=0 (ok)
    + (8) monitor: interval=10000ms rc=7 (not running)
    + (9) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (10) start: rc=0 (ok)
    + (12) monitor: interval=3000ms rc=0 (ok)
   kamailio:0: migration-threshold=2 fail-count=4 last-failure='Thu Mar 3 11:50:27 2011'
    + (23) stop: rc=0 (ok)
    + (24) start: rc=0 (ok)
    + (25) monitor: interval=10000ms rc=0 (ok)
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (9) start: rc=0 (ok)
    + (10) monitor: interval=10000ms rc=7 (not running)
    + (11) stop: rc=0 (ok)
   failover-ip: migration-threshold=1000000
    + (4) start: rc=0 (ok)
    + (6) monitor: interval=3000ms rc=0 (ok)
    + (12) stop: rc=0 (ok)
   kamailio:1: migration-threshold=2 last-failure='Thu Mar 3 11:50:37 2011'
    + (13) start: rc=0 (ok)
    + (14) monitor: interval=10000ms rc=0 (ok)
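
To make the symptom easier to spot in status output like the above, here is a minimal Python sketch that lists every resource instance whose fail-count has reached its migration-threshold. It is not part of my cluster setup: the function name `over_threshold`, the regular expressions, and the shortened `SAMPLE` text (taken from the last status block, operations omitted) are my own assumptions about how the "Operations:" section is laid out.

```python
import re

# Per-resource line in the "Operations:" section, for example:
#   kamailio:0: migration-threshold=2 fail-count=4 last-failure='...'
# The fail-count field is absent when the count is zero.
RSC_LINE = re.compile(
    r"^\s*(?P<rsc>[\w:-]+): migration-threshold=(?P<thr>\d+)"
    r"(?: fail-count=(?P<fc>\d+))?"
)
# Node header line, for example: "* Node armani:"
NODE_LINE = re.compile(r"^\s*\* Node (?P<node>\S+):\s*$")


def over_threshold(text):
    """Return (node, resource, fail_count, threshold) tuples for every
    resource instance whose fail-count is at or above its threshold."""
    hits = []
    node = None
    for line in text.splitlines():
        m = NODE_LINE.match(line)
        if m:
            node = m.group("node")
            continue
        m = RSC_LINE.match(line)
        if m and node is not None:
            fc = int(m.group("fc") or 0)
            thr = int(m.group("thr"))
            if fc >= thr:
                hits.append((node, m.group("rsc"), fc, thr))
    return hits


# Shortened copy of the last status block (operation lines left out).
SAMPLE = """\
* Node armani:
   kamailio:1: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:27 2011'
   failover-ip: migration-threshold=1000000
   kamailio:0: migration-threshold=2 fail-count=4 last-failure='Thu Mar 3 11:50:27 2011'
* Node bulgari:
   kamailio:0: migration-threshold=2 fail-count=2 last-failure='Thu Mar 3 11:50:37 2011'
   kamailio:1: migration-threshold=2 last-failure='Thu Mar 3 11:50:37 2011'
"""

if __name__ == "__main__":
    for node, rsc, fc, thr in over_threshold(SAMPLE):
        print(f"{node}: {rsc} fail-count={fc} >= migration-threshold={thr}")
```

On the sample it reports kamailio:1 and kamailio:0 on armani and kamailio:0 on bulgari, which is the state I would have expected to keep those instances from being restarted in place.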