Add test cases where resource affinity rules are used with the static
utilization scheduler and the rebalance-on-start option enabled. These verify
the behavior in the following scenarios:
- 7 resources with intertwined resource affinity rules in a 3 node cluster;
  1 node failing
- 3 resources in negative affinity in a 3 node cluster, where the rules are
  stated in pairwise form; 1 node failing
- 5 resources in negative affinity in a 5 node cluster; nodes consecutively
  failing after each other

Signed-off-by: Daniel Kral <d.k...@proxmox.com>
---
 .../README | 26 ++
 .../cmdlist | 4 +
 .../datacenter.cfg | 6 +
 .../hardware_status | 5 +
 .../log.expect | 120 ++++++++
 .../manager_status | 1 +
 .../rules_config | 19 ++
 .../service_config | 10 +
 .../static_service_stats | 10 +
 .../README | 20 ++
 .../cmdlist | 4 +
 .../datacenter.cfg | 6 +
 .../hardware_status | 5 +
 .../log.expect | 174 +++++++++++
 .../manager_status | 1 +
 .../rules_config | 11 +
 .../service_config | 14 +
 .../static_service_stats | 14 +
 .../README | 22 ++
 .../cmdlist | 22 ++
 .../datacenter.cfg | 6 +
 .../hardware_status | 7 +
 .../log.expect | 272 ++++++++++++++++++
 .../manager_status | 1 +
 .../rules_config | 3 +
 .../service_config | 9 +
 .../static_service_stats | 9 +
 27 files changed, 801 insertions(+)
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/README
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/service_config
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/README
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/service_config
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/README
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/cmdlist
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/hardware_status
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/log.expect
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/manager_status
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/rules_config
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/service_config
 create mode 100644 src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats

diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/README b/src/test/test-crs-static-rebalance-resource-affinity1/README
new file mode 100644
index 0000000..9b36bf6
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/README
@@ -0,0 +1,26 @@
+Test whether a mixed set of strict resource affinity rules in conjunction with
+the static load scheduler with auto-rebalancing are applied correctly on
+resource start and in case of a subsequent failover.
+
+The test scenario is:
+- vm:101 and vm:102 do not have any resource affinity
+- Services that must be kept together:
+  - vm:102 and vm:107
+  - vm:104, vm:106, and vm:108
+- Services that must be kept separate:
+  - vm:103, vm:104, and vm:105
+  - vm:103, vm:106, and vm:107
+  - vm:107 and vm:108
+- Therefore, there are consistent interdependencies between the positive and
+  negative resource affinity rules' resource members
+- vm:101 and vm:102 are currently assigned to node1 and node2 respectively
+- vm:103 through vm:108 are currently assigned to node3
+
+The expected outcome is:
+- vm:101, vm:102, and vm:103 should be started on node1, node2, and node3
+  respectively, as there is nothing running on them yet
+- vm:104, vm:106, and vm:108 should all be assigned to the same node, which
+  will be node1, since it has the most resources left for vm:104
+- vm:105 and vm:107 should both be assigned to the same node, which will be
+  node2, since neither can be assigned to the other nodes because of the
+  resource affinity constraints
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/cmdlist b/src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
new file mode 100644
index 0000000..eee0e40
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/cmdlist
@@ -0,0 +1,4 @@
+[
+    [ "power node1 on", "power node2 on", "power node3 on"],
+    [ "network node3 off" ]
+]
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg b/src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
new file mode 100644
index 0000000..f2671a5
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/datacenter.cfg
@@ -0,0 +1,6 @@
+{
+    "crs": {
+        "ha": "static",
+        "ha-rebalance-on-start": 1
+    }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
new file mode 100644
index 0000000..84484af
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/hardware_status
@@ -0,0 +1,5 @@
+{
+  "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 },
+  "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }
+}
diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/log.expect b/src/test/test-crs-static-rebalance-resource-affinity1/log.expect
new file mode 100644
index 0000000..cdd2497
--- /dev/null
+++ b/src/test/test-crs-static-rebalance-resource-affinity1/log.expect
@@ -0,0 +1,120 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got
lock 'ha_manager_lock' +info 20 node1/crm: status change wait_for_quorum => master +info 20 node1/crm: using scheduler mode 'static' +info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online' +info 20 node1/crm: adding new service 'vm:101' on node 'node1' +info 20 node1/crm: adding new service 'vm:102' on node 'node2' +info 20 node1/crm: adding new service 'vm:103' on node 'node3' +info 20 node1/crm: adding new service 'vm:104' on node 'node3' +info 20 node1/crm: adding new service 'vm:105' on node 'node3' +info 20 node1/crm: adding new service 'vm:106' on node 'node3' +info 20 node1/crm: adding new service 'vm:107' on node 'node3' +info 20 node1/crm: adding new service 'vm:108' on node 'node3' +info 20 node1/crm: service vm:101: re-balance selected current node node1 for startup +info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1) +info 20 node1/crm: service vm:102: re-balance selected current node node2 for startup +info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2) +info 20 node1/crm: service vm:103: re-balance selected current node node3 for startup +info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node3) +info 20 node1/crm: service vm:104: re-balance selected new node node1 for startup +info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node1) +info 20 node1/crm: service vm:105: re-balance selected new node node2 for startup +info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node2) +info 20 node1/crm: service vm:106: re-balance selected new node node1 for startup +info 20 node1/crm: service 'vm:106': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node1) +info 20 node1/crm: service vm:107: re-balance selected new node node2 for startup +info 20 node1/crm: service 'vm:107': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node2) +info 20 node1/crm: service vm:108: re-balance selected new node node1 for startup +info 20 node1/crm: service 'vm:108': state changed from 'request_start' to 'request_start_balance' (node = node3, target = node1) +info 21 node1/lrm: got lock 'ha_agent_node1_lock' +info 21 node1/lrm: status change wait_for_agent_lock => active +info 21 node1/lrm: starting service vm:101 +info 21 node1/lrm: service status vm:101 started +info 22 node2/crm: status change wait_for_quorum => slave +info 23 node2/lrm: got lock 'ha_agent_node2_lock' +info 23 node2/lrm: status change wait_for_agent_lock => active +info 23 node2/lrm: starting service vm:102 +info 23 node2/lrm: service status vm:102 started +info 24 node3/crm: status change wait_for_quorum => slave +info 25 node3/lrm: got lock 'ha_agent_node3_lock' +info 25 node3/lrm: status change wait_for_agent_lock => active +info 25 node3/lrm: starting service vm:103 +info 25 node3/lrm: service status vm:103 started +info 25 node3/lrm: service vm:104 - start relocate to node 'node1' +info 25 node3/lrm: service vm:104 - end relocate to node 'node1' +info 25 node3/lrm: service vm:105 - start relocate to node 'node2' +info 25 node3/lrm: service vm:105 - end relocate to node 'node2' +info 25 node3/lrm: service vm:106 - 
start relocate to node 'node1' +info 25 node3/lrm: service vm:106 - end relocate to node 'node1' +info 25 node3/lrm: service vm:107 - start relocate to node 'node2' +info 25 node3/lrm: service vm:107 - end relocate to node 'node2' +info 25 node3/lrm: service vm:108 - start relocate to node 'node1' +info 25 node3/lrm: service vm:108 - end relocate to node 'node1' +info 40 node1/crm: service 'vm:104': state changed from 'request_start_balance' to 'started' (node = node1) +info 40 node1/crm: service 'vm:105': state changed from 'request_start_balance' to 'started' (node = node2) +info 40 node1/crm: service 'vm:106': state changed from 'request_start_balance' to 'started' (node = node1) +info 40 node1/crm: service 'vm:107': state changed from 'request_start_balance' to 'started' (node = node2) +info 40 node1/crm: service 'vm:108': state changed from 'request_start_balance' to 'started' (node = node1) +info 41 node1/lrm: starting service vm:104 +info 41 node1/lrm: service status vm:104 started +info 41 node1/lrm: starting service vm:106 +info 41 node1/lrm: service status vm:106 started +info 41 node1/lrm: starting service vm:108 +info 41 node1/lrm: service status vm:108 started +info 43 node2/lrm: starting service vm:105 +info 43 node2/lrm: service status vm:105 started +info 43 node2/lrm: starting service vm:107 +info 43 node2/lrm: service status vm:107 started +info 120 cmdlist: execute network node3 off +info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown' +info 124 node3/crm: status change slave => wait_for_quorum +info 125 node3/lrm: status change active => lost_agent_lock +info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence' +info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence' +emai 160 node1/crm: FENCE: Try to fence node 'node3' +info 166 watchdog: execute power node3 off +info 165 node3/crm: killed by poweroff +info 166 node3/lrm: killed by poweroff +info 166 hardware: server 'node3' stopped by poweroff (watchdog) +info 240 node1/crm: got lock 'ha_agent_node3_lock' +info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3' +info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown' +emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3' +info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery' +err 240 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 260 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 280 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 300 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 320 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 340 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 360 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 380 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 400 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 420 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 440 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 460 node1/crm: 
recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 480 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 500 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 520 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 540 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 560 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 580 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 600 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 620 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 640 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 660 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 680 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +err 700 node1/crm: recovering service 'vm:103' from fenced node 'node3' failed, no recovery node found +info 720 hardware: exit simulation - done diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/manager_status b/src/test/test-crs-static-rebalance-resource-affinity1/manager_status new file mode 100644 index 0000000..9e26dfe --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity1/manager_status @@ -0,0 +1 @@ +{} \ No newline at end of file diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/rules_config b/src/test/test-crs-static-rebalance-resource-affinity1/rules_config new file mode 100644 index 0000000..734a055 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity1/rules_config @@ -0,0 +1,19 @@ +resource-affinity: vms-must-stick-together1 + resources vm:102,vm:107 + affinity positive + +resource-affinity: vms-must-stick-together2 + resources vm:104,vm:106,vm:108 + affinity positive + +resource-affinity: vms-must-stay-apart1 + resources vm:103,vm:104,vm:105 + affinity negative + +resource-affinity: vms-must-stay-apart2 + resources vm:103,vm:106,vm:107 + affinity negative + +resource-affinity: vms-must-stay-apart3 + resources vm:107,vm:108 + affinity negative diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/service_config b/src/test/test-crs-static-rebalance-resource-affinity1/service_config new file mode 100644 index 0000000..02e4a07 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity1/service_config @@ -0,0 +1,10 @@ +{ + "vm:101": { "node": "node1", "state": "started" }, + "vm:102": { "node": "node2", "state": "started" }, + "vm:103": { "node": "node3", "state": "started" }, + "vm:104": { "node": "node3", "state": "started" }, + "vm:105": { "node": "node3", "state": "started" }, + "vm:106": { "node": "node3", "state": "started" }, + "vm:107": { "node": "node3", "state": "started" }, + "vm:108": { "node": "node3", "state": "started" } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats b/src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats new file mode 100644 index 0000000..c6472ca --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity1/static_service_stats @@ -0,0 +1,10 @@ +{ + "vm:101": { 
"maxcpu": 8, "maxmem": 16000000000 }, + "vm:102": { "maxcpu": 4, "maxmem": 24000000000 }, + "vm:103": { "maxcpu": 2, "maxmem": 32000000000 }, + "vm:104": { "maxcpu": 4, "maxmem": 48000000000 }, + "vm:105": { "maxcpu": 8, "maxmem": 16000000000 }, + "vm:106": { "maxcpu": 4, "maxmem": 32000000000 }, + "vm:107": { "maxcpu": 2, "maxmem": 64000000000 }, + "vm:108": { "maxcpu": 8, "maxmem": 48000000000 } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/README b/src/test/test-crs-static-rebalance-resource-affinity2/README new file mode 100644 index 0000000..6354299 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/README @@ -0,0 +1,20 @@ +Test whether a pairwise strict negative resource affinity rules, i.e. negative +resource affinity relations a<->b, b<->c and a<->c, in conjunction with the +static load scheduler with auto-rebalancing are applied correctly on resource +start and in case of a subsequent failover. + +The test scenario is: +- vm:100 and vm:200 must be kept separate +- vm:200 and vm:300 must be kept separate +- vm:100 and vm:300 must be kept separate +- Therefore, vm:100, vm:200, and vm:300 must be kept separate +- The resources' static usage stats are chosen so that during rebalancing vm:300 + will need to select a less than ideal node according to the static usage + scheduler, i.e. node1 being the ideal one, to test whether the resource + affinity rule still applies correctly + +The expected outcome is: +- vm:100, vm:200, and vm:300 should be started on node1, node2, and node3 + respectively, just as if the three negative resource affinity rule would've + been stated in a single negative resource affinity rule +- As node3 fails, vm:300 cannot be recovered diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/cmdlist b/src/test/test-crs-static-rebalance-resource-affinity2/cmdlist new file mode 100644 index 0000000..eee0e40 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/cmdlist @@ -0,0 +1,4 @@ +[ + [ "power node1 on", "power node2 on", "power node3 on"], + [ "network node3 off" ] +] diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg b/src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg new file mode 100644 index 0000000..f2671a5 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/datacenter.cfg @@ -0,0 +1,6 @@ +{ + "crs": { + "ha": "static", + "ha-rebalance-on-start": 1 + } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status new file mode 100644 index 0000000..84484af --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/hardware_status @@ -0,0 +1,5 @@ +{ + "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }, + "node2": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 }, + "node3": { "power": "off", "network": "off", "cpus": 8, "memory": 112000000000 } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/log.expect b/src/test/test-crs-static-rebalance-resource-affinity2/log.expect new file mode 100644 index 0000000..a7e5c8e --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/log.expect @@ -0,0 +1,174 @@ +info 0 hardware: starting simulation +info 20 cmdlist: execute power node1 on +info 20 node1/crm: status change startup => wait_for_quorum +info 20 node1/lrm: status change startup => wait_for_agent_lock +info 20 
cmdlist: execute power node2 on +info 20 node2/crm: status change startup => wait_for_quorum +info 20 node2/lrm: status change startup => wait_for_agent_lock +info 20 cmdlist: execute power node3 on +info 20 node3/crm: status change startup => wait_for_quorum +info 20 node3/lrm: status change startup => wait_for_agent_lock +info 20 node1/crm: got lock 'ha_manager_lock' +info 20 node1/crm: status change wait_for_quorum => master +info 20 node1/crm: using scheduler mode 'static' +info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online' +info 20 node1/crm: adding new service 'vm:100' on node 'node1' +info 20 node1/crm: adding new service 'vm:101' on node 'node1' +info 20 node1/crm: adding new service 'vm:102' on node 'node1' +info 20 node1/crm: adding new service 'vm:103' on node 'node1' +info 20 node1/crm: adding new service 'vm:200' on node 'node1' +info 20 node1/crm: adding new service 'vm:201' on node 'node1' +info 20 node1/crm: adding new service 'vm:202' on node 'node1' +info 20 node1/crm: adding new service 'vm:203' on node 'node1' +info 20 node1/crm: adding new service 'vm:300' on node 'node1' +info 20 node1/crm: adding new service 'vm:301' on node 'node1' +info 20 node1/crm: adding new service 'vm:302' on node 'node1' +info 20 node1/crm: adding new service 'vm:303' on node 'node1' +info 20 node1/crm: service vm:100: re-balance selected current node node1 for startup +info 20 node1/crm: service 'vm:100': state changed from 'request_start' to 'started' (node = node1) +info 20 node1/crm: service vm:101: re-balance selected new node node2 for startup +info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2) +info 20 node1/crm: service vm:102: re-balance selected new node node3 for startup +info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3) +info 20 node1/crm: service vm:103: re-balance selected new node node3 for startup +info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3) +info 20 node1/crm: service vm:200: re-balance selected new node node2 for startup +info 20 node1/crm: service 'vm:200': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2) +info 20 node1/crm: service vm:201: re-balance selected new node node3 for startup +info 20 node1/crm: service 'vm:201': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3) +info 20 node1/crm: service vm:202: re-balance selected new node node3 for startup +info 20 node1/crm: service 'vm:202': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3) +info 20 node1/crm: service vm:203: re-balance selected current node node1 for startup +info 20 node1/crm: service 'vm:203': state changed from 'request_start' to 'started' (node = node1) +info 20 node1/crm: service vm:300: re-balance selected new node node3 for startup +info 20 node1/crm: service 'vm:300': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3) +info 20 node1/crm: service vm:301: re-balance selected current node node1 for startup +info 20 node1/crm: service 'vm:301': state changed from 'request_start' to 'started' (node = node1) +info 20 
node1/crm: service vm:302: re-balance selected new node node2 for startup +info 20 node1/crm: service 'vm:302': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2) +info 20 node1/crm: service vm:303: re-balance selected current node node1 for startup +info 20 node1/crm: service 'vm:303': state changed from 'request_start' to 'started' (node = node1) +info 21 node1/lrm: got lock 'ha_agent_node1_lock' +info 21 node1/lrm: status change wait_for_agent_lock => active +info 21 node1/lrm: starting service vm:100 +info 21 node1/lrm: service status vm:100 started +info 21 node1/lrm: service vm:101 - start relocate to node 'node2' +info 21 node1/lrm: service vm:101 - end relocate to node 'node2' +info 21 node1/lrm: service vm:102 - start relocate to node 'node3' +info 21 node1/lrm: service vm:102 - end relocate to node 'node3' +info 21 node1/lrm: service vm:103 - start relocate to node 'node3' +info 21 node1/lrm: service vm:103 - end relocate to node 'node3' +info 21 node1/lrm: service vm:200 - start relocate to node 'node2' +info 21 node1/lrm: service vm:200 - end relocate to node 'node2' +info 21 node1/lrm: service vm:201 - start relocate to node 'node3' +info 21 node1/lrm: service vm:201 - end relocate to node 'node3' +info 21 node1/lrm: service vm:202 - start relocate to node 'node3' +info 21 node1/lrm: service vm:202 - end relocate to node 'node3' +info 21 node1/lrm: starting service vm:203 +info 21 node1/lrm: service status vm:203 started +info 21 node1/lrm: service vm:300 - start relocate to node 'node3' +info 21 node1/lrm: service vm:300 - end relocate to node 'node3' +info 21 node1/lrm: starting service vm:301 +info 21 node1/lrm: service status vm:301 started +info 21 node1/lrm: service vm:302 - start relocate to node 'node2' +info 21 node1/lrm: service vm:302 - end relocate to node 'node2' +info 21 node1/lrm: starting service vm:303 +info 21 node1/lrm: service status vm:303 started +info 22 node2/crm: status change wait_for_quorum => slave +info 24 node3/crm: status change wait_for_quorum => slave +info 40 node1/crm: service 'vm:101': state changed from 'request_start_balance' to 'started' (node = node2) +info 40 node1/crm: service 'vm:102': state changed from 'request_start_balance' to 'started' (node = node3) +info 40 node1/crm: service 'vm:103': state changed from 'request_start_balance' to 'started' (node = node3) +info 40 node1/crm: service 'vm:200': state changed from 'request_start_balance' to 'started' (node = node2) +info 40 node1/crm: service 'vm:201': state changed from 'request_start_balance' to 'started' (node = node3) +info 40 node1/crm: service 'vm:202': state changed from 'request_start_balance' to 'started' (node = node3) +info 40 node1/crm: service 'vm:300': state changed from 'request_start_balance' to 'started' (node = node3) +info 40 node1/crm: service 'vm:302': state changed from 'request_start_balance' to 'started' (node = node2) +info 43 node2/lrm: got lock 'ha_agent_node2_lock' +info 43 node2/lrm: status change wait_for_agent_lock => active +info 43 node2/lrm: starting service vm:101 +info 43 node2/lrm: service status vm:101 started +info 43 node2/lrm: starting service vm:200 +info 43 node2/lrm: service status vm:200 started +info 43 node2/lrm: starting service vm:302 +info 43 node2/lrm: service status vm:302 started +info 45 node3/lrm: got lock 'ha_agent_node3_lock' +info 45 node3/lrm: status change wait_for_agent_lock => active +info 45 node3/lrm: starting service vm:102 +info 45 node3/lrm: service status vm:102 started 
+info 45 node3/lrm: starting service vm:103 +info 45 node3/lrm: service status vm:103 started +info 45 node3/lrm: starting service vm:201 +info 45 node3/lrm: service status vm:201 started +info 45 node3/lrm: starting service vm:202 +info 45 node3/lrm: service status vm:202 started +info 45 node3/lrm: starting service vm:300 +info 45 node3/lrm: service status vm:300 started +info 120 cmdlist: execute network node3 off +info 120 node1/crm: node 'node3': state changed from 'online' => 'unknown' +info 124 node3/crm: status change slave => wait_for_quorum +info 125 node3/lrm: status change active => lost_agent_lock +info 160 node1/crm: service 'vm:102': state changed from 'started' to 'fence' +info 160 node1/crm: service 'vm:103': state changed from 'started' to 'fence' +info 160 node1/crm: service 'vm:201': state changed from 'started' to 'fence' +info 160 node1/crm: service 'vm:202': state changed from 'started' to 'fence' +info 160 node1/crm: service 'vm:300': state changed from 'started' to 'fence' +info 160 node1/crm: node 'node3': state changed from 'unknown' => 'fence' +emai 160 node1/crm: FENCE: Try to fence node 'node3' +info 166 watchdog: execute power node3 off +info 165 node3/crm: killed by poweroff +info 166 node3/lrm: killed by poweroff +info 166 hardware: server 'node3' stopped by poweroff (watchdog) +info 240 node1/crm: got lock 'ha_agent_node3_lock' +info 240 node1/crm: fencing: acknowledged - got agent lock for node 'node3' +info 240 node1/crm: node 'node3': state changed from 'fence' => 'unknown' +emai 240 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3' +info 240 node1/crm: service 'vm:102': state changed from 'fence' to 'recovery' +info 240 node1/crm: service 'vm:103': state changed from 'fence' to 'recovery' +info 240 node1/crm: service 'vm:201': state changed from 'fence' to 'recovery' +info 240 node1/crm: service 'vm:202': state changed from 'fence' to 'recovery' +info 240 node1/crm: service 'vm:300': state changed from 'fence' to 'recovery' +info 240 node1/crm: recover service 'vm:102' from fenced node 'node3' to node 'node1' +info 240 node1/crm: service 'vm:102': state changed from 'recovery' to 'started' (node = node1) +info 240 node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node2' +info 240 node1/crm: service 'vm:103': state changed from 'recovery' to 'started' (node = node2) +info 240 node1/crm: recover service 'vm:201' from fenced node 'node3' to node 'node2' +info 240 node1/crm: service 'vm:201': state changed from 'recovery' to 'started' (node = node2) +info 240 node1/crm: recover service 'vm:202' from fenced node 'node3' to node 'node2' +info 240 node1/crm: service 'vm:202': state changed from 'recovery' to 'started' (node = node2) +err 240 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 240 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +info 241 node1/lrm: starting service vm:102 +info 241 node1/lrm: service status vm:102 started +info 243 node2/lrm: starting service vm:103 +info 243 node2/lrm: service status vm:103 started +info 243 node2/lrm: starting service vm:201 +info 243 node2/lrm: service status vm:201 started +info 243 node2/lrm: starting service vm:202 +info 243 node2/lrm: service status vm:202 started +err 260 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 280 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node 
found +err 300 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 320 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 340 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 360 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 380 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 400 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 420 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 440 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 460 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 480 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 500 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 520 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 540 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 560 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 580 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 600 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 620 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 640 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 660 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 680 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +err 700 node1/crm: recovering service 'vm:300' from fenced node 'node3' failed, no recovery node found +info 720 hardware: exit simulation - done diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/manager_status b/src/test/test-crs-static-rebalance-resource-affinity2/manager_status new file mode 100644 index 0000000..9e26dfe --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/manager_status @@ -0,0 +1 @@ +{} \ No newline at end of file diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/rules_config b/src/test/test-crs-static-rebalance-resource-affinity2/rules_config new file mode 100644 index 0000000..bfe8787 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/rules_config @@ -0,0 +1,11 @@ +resource-affinity: very-lonely-services1 + resources vm:100,vm:200 + affinity negative + +resource-affinity: very-lonely-services2 + resources vm:200,vm:300 + affinity negative + +resource-affinity: very-lonely-services3 + resources vm:100,vm:300 + affinity negative diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/service_config b/src/test/test-crs-static-rebalance-resource-affinity2/service_config new file mode 100644 index 0000000..0de367e --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/service_config @@ -0,0 +1,14 @@ +{ + "vm:100": { "node": "node1", "state": "started" }, + "vm:101": { "node": "node1", "state": "started" }, + "vm:102": 
{ "node": "node1", "state": "started" }, + "vm:103": { "node": "node1", "state": "started" }, + "vm:200": { "node": "node1", "state": "started" }, + "vm:201": { "node": "node1", "state": "started" }, + "vm:202": { "node": "node1", "state": "started" }, + "vm:203": { "node": "node1", "state": "started" }, + "vm:300": { "node": "node1", "state": "started" }, + "vm:301": { "node": "node1", "state": "started" }, + "vm:302": { "node": "node1", "state": "started" }, + "vm:303": { "node": "node1", "state": "started" } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats b/src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats new file mode 100644 index 0000000..3c7502e --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity2/static_service_stats @@ -0,0 +1,14 @@ +{ + "vm:100": { "maxcpu": 8, "maxmem": 16000000000 }, + "vm:101": { "maxcpu": 4, "maxmem": 8000000000 }, + "vm:102": { "maxcpu": 2, "maxmem": 8000000000 }, + "vm:103": { "maxcpu": 2, "maxmem": 4000000000 }, + "vm:200": { "maxcpu": 4, "maxmem": 24000000000 }, + "vm:201": { "maxcpu": 2, "maxmem": 8000000000 }, + "vm:202": { "maxcpu": 4, "maxmem": 4000000000 }, + "vm:203": { "maxcpu": 2, "maxmem": 8000000000 }, + "vm:300": { "maxcpu": 6, "maxmem": 32000000000 }, + "vm:301": { "maxcpu": 2, "maxmem": 4000000000 }, + "vm:302": { "maxcpu": 2, "maxmem": 8000000000 }, + "vm:303": { "maxcpu": 4, "maxmem": 8000000000 } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/README b/src/test/test-crs-static-rebalance-resource-affinity3/README new file mode 100644 index 0000000..9e57662 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/README @@ -0,0 +1,22 @@ +Test whether a more complex set of pairwise strict negative resource affinity +rules, i.e. there's negative resource affinity relations a<->b, b<->c and a<->c, +with 5 resources in conjunction with the static load scheduler with +auto-rebalancing are applied correctly on resource start and in case of a +consecutive failover of all nodes after each other. + +The test scenario is: +- vm:100, vm:200, vm:300, vm:400, and vm:500 must be kept separate +- The resources' static usage stats are chosen so that during rebalancing vm:300 + and vm:500 will need to select a less than ideal node according to the static + usage scheduler, i.e. node2 and node3 being their ideal ones, to test whether + the resource affinity rule still applies correctly + +The expected outcome is: +- vm:100, vm:200, vm:300, vm:400, and vm:500 should be started on node2, node1, + node4, node3, and node5 respectively +- vm:400 and vm:500 are started on node3 and node5, instead of node2 and node3 + as would've been without the resource affinity rules +- As node1, node2, node3, node4, and node5 fail consecutively with each node + coming back online, vm:200, vm:100, vm:400, vm:300, and vm:500 will be put in + recovery during the failover respectively, as there is no other node left to + accomodate them without violating the resource affinity rule. 
diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/cmdlist b/src/test/test-crs-static-rebalance-resource-affinity3/cmdlist new file mode 100644 index 0000000..6665419 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/cmdlist @@ -0,0 +1,22 @@ +[ + [ "power node1 on", "power node2 on", "power node3 on", "power node4 on", "power node5 on" ], + [ "power node1 off" ], + [ "delay 100" ], + [ "power node1 on" ], + [ "delay 100" ], + [ "power node2 off" ], + [ "delay 100" ], + [ "power node2 on" ], + [ "delay 100" ], + [ "power node3 off" ], + [ "delay 100" ], + [ "power node3 on" ], + [ "delay 100" ], + [ "power node4 off" ], + [ "delay 100" ], + [ "power node4 on" ], + [ "delay 100" ], + [ "power node5 off" ], + [ "delay 100" ], + [ "power node5 on" ] +] diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg b/src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg new file mode 100644 index 0000000..f2671a5 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/datacenter.cfg @@ -0,0 +1,6 @@ +{ + "crs": { + "ha": "static", + "ha-rebalance-on-start": 1 + } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status b/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status new file mode 100644 index 0000000..b6dcb1a --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/hardware_status @@ -0,0 +1,7 @@ +{ + "node1": { "power": "off", "network": "off", "cpus": 8, "memory": 48000000000 }, + "node2": { "power": "off", "network": "off", "cpus": 32, "memory": 36000000000 }, + "node3": { "power": "off", "network": "off", "cpus": 16, "memory": 24000000000 }, + "node4": { "power": "off", "network": "off", "cpus": 32, "memory": 36000000000 }, + "node5": { "power": "off", "network": "off", "cpus": 8, "memory": 48000000000 } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/log.expect b/src/test/test-crs-static-rebalance-resource-affinity3/log.expect new file mode 100644 index 0000000..4e87f03 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/log.expect @@ -0,0 +1,272 @@ +info 0 hardware: starting simulation +info 20 cmdlist: execute power node1 on +info 20 node1/crm: status change startup => wait_for_quorum +info 20 node1/lrm: status change startup => wait_for_agent_lock +info 20 cmdlist: execute power node2 on +info 20 node2/crm: status change startup => wait_for_quorum +info 20 node2/lrm: status change startup => wait_for_agent_lock +info 20 cmdlist: execute power node3 on +info 20 node3/crm: status change startup => wait_for_quorum +info 20 node3/lrm: status change startup => wait_for_agent_lock +info 20 cmdlist: execute power node4 on +info 20 node4/crm: status change startup => wait_for_quorum +info 20 node4/lrm: status change startup => wait_for_agent_lock +info 20 cmdlist: execute power node5 on +info 20 node5/crm: status change startup => wait_for_quorum +info 20 node5/lrm: status change startup => wait_for_agent_lock +info 20 node1/crm: got lock 'ha_manager_lock' +info 20 node1/crm: status change wait_for_quorum => master +info 20 node1/crm: using scheduler mode 'static' +info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node4': state changed from 'unknown' => 'online' +info 20 node1/crm: node 'node5': 
state changed from 'unknown' => 'online' +info 20 node1/crm: adding new service 'vm:100' on node 'node1' +info 20 node1/crm: adding new service 'vm:101' on node 'node1' +info 20 node1/crm: adding new service 'vm:200' on node 'node1' +info 20 node1/crm: adding new service 'vm:201' on node 'node1' +info 20 node1/crm: adding new service 'vm:300' on node 'node1' +info 20 node1/crm: adding new service 'vm:400' on node 'node1' +info 20 node1/crm: adding new service 'vm:500' on node 'node1' +info 20 node1/crm: service vm:100: re-balance selected new node node2 for startup +info 20 node1/crm: service 'vm:100': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node2) +info 20 node1/crm: service vm:101: re-balance selected new node node4 for startup +info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node4) +info 20 node1/crm: service vm:200: re-balance selected current node node1 for startup +info 20 node1/crm: service 'vm:200': state changed from 'request_start' to 'started' (node = node1) +info 20 node1/crm: service vm:201: re-balance selected new node node5 for startup +info 20 node1/crm: service 'vm:201': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node5) +info 20 node1/crm: service vm:300: re-balance selected new node node4 for startup +info 20 node1/crm: service 'vm:300': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node4) +info 20 node1/crm: service vm:400: re-balance selected new node node3 for startup +info 20 node1/crm: service 'vm:400': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node3) +info 20 node1/crm: service vm:500: re-balance selected new node node5 for startup +info 20 node1/crm: service 'vm:500': state changed from 'request_start' to 'request_start_balance' (node = node1, target = node5) +info 21 node1/lrm: got lock 'ha_agent_node1_lock' +info 21 node1/lrm: status change wait_for_agent_lock => active +info 21 node1/lrm: service vm:100 - start relocate to node 'node2' +info 21 node1/lrm: service vm:100 - end relocate to node 'node2' +info 21 node1/lrm: service vm:101 - start relocate to node 'node4' +info 21 node1/lrm: service vm:101 - end relocate to node 'node4' +info 21 node1/lrm: starting service vm:200 +info 21 node1/lrm: service status vm:200 started +info 21 node1/lrm: service vm:201 - start relocate to node 'node5' +info 21 node1/lrm: service vm:201 - end relocate to node 'node5' +info 21 node1/lrm: service vm:300 - start relocate to node 'node4' +info 21 node1/lrm: service vm:300 - end relocate to node 'node4' +info 21 node1/lrm: service vm:400 - start relocate to node 'node3' +info 21 node1/lrm: service vm:400 - end relocate to node 'node3' +info 21 node1/lrm: service vm:500 - start relocate to node 'node5' +info 21 node1/lrm: service vm:500 - end relocate to node 'node5' +info 22 node2/crm: status change wait_for_quorum => slave +info 24 node3/crm: status change wait_for_quorum => slave +info 26 node4/crm: status change wait_for_quorum => slave +info 28 node5/crm: status change wait_for_quorum => slave +info 40 node1/crm: service 'vm:100': state changed from 'request_start_balance' to 'started' (node = node2) +info 40 node1/crm: service 'vm:101': state changed from 'request_start_balance' to 'started' (node = node4) +info 40 node1/crm: service 'vm:201': state changed from 'request_start_balance' to 'started' (node = node5) +info 40 
node1/crm: service 'vm:300': state changed from 'request_start_balance' to 'started' (node = node4) +info 40 node1/crm: service 'vm:400': state changed from 'request_start_balance' to 'started' (node = node3) +info 40 node1/crm: service 'vm:500': state changed from 'request_start_balance' to 'started' (node = node5) +info 43 node2/lrm: got lock 'ha_agent_node2_lock' +info 43 node2/lrm: status change wait_for_agent_lock => active +info 43 node2/lrm: starting service vm:100 +info 43 node2/lrm: service status vm:100 started +info 45 node3/lrm: got lock 'ha_agent_node3_lock' +info 45 node3/lrm: status change wait_for_agent_lock => active +info 45 node3/lrm: starting service vm:400 +info 45 node3/lrm: service status vm:400 started +info 47 node4/lrm: got lock 'ha_agent_node4_lock' +info 47 node4/lrm: status change wait_for_agent_lock => active +info 47 node4/lrm: starting service vm:101 +info 47 node4/lrm: service status vm:101 started +info 47 node4/lrm: starting service vm:300 +info 47 node4/lrm: service status vm:300 started +info 49 node5/lrm: got lock 'ha_agent_node5_lock' +info 49 node5/lrm: status change wait_for_agent_lock => active +info 49 node5/lrm: starting service vm:201 +info 49 node5/lrm: service status vm:201 started +info 49 node5/lrm: starting service vm:500 +info 49 node5/lrm: service status vm:500 started +info 120 cmdlist: execute power node1 off +info 120 node1/crm: killed by poweroff +info 120 node1/lrm: killed by poweroff +info 220 cmdlist: execute delay 100 +info 222 node3/crm: got lock 'ha_manager_lock' +info 222 node3/crm: status change slave => master +info 222 node3/crm: using scheduler mode 'static' +info 222 node3/crm: node 'node1': state changed from 'online' => 'unknown' +info 282 node3/crm: service 'vm:200': state changed from 'started' to 'fence' +info 282 node3/crm: node 'node1': state changed from 'unknown' => 'fence' +emai 282 node3/crm: FENCE: Try to fence node 'node1' +info 282 node3/crm: got lock 'ha_agent_node1_lock' +info 282 node3/crm: fencing: acknowledged - got agent lock for node 'node1' +info 282 node3/crm: node 'node1': state changed from 'fence' => 'unknown' +emai 282 node3/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node1' +info 282 node3/crm: service 'vm:200': state changed from 'fence' to 'recovery' +err 282 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found +err 302 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found +err 322 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found +err 342 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found +err 362 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found +err 382 node3/crm: recovering service 'vm:200' from fenced node 'node1' failed, no recovery node found +info 400 cmdlist: execute power node1 on +info 400 node1/crm: status change startup => wait_for_quorum +info 400 node1/lrm: status change startup => wait_for_agent_lock +info 400 node1/crm: status change wait_for_quorum => slave +info 404 node3/crm: node 'node1': state changed from 'unknown' => 'online' +info 404 node3/crm: recover service 'vm:200' to previous failed and fenced node 'node1' again +info 404 node3/crm: service 'vm:200': state changed from 'recovery' to 'started' (node = node1) +info 421 node1/lrm: got lock 'ha_agent_node1_lock' +info 421 node1/lrm: status change wait_for_agent_lock => active +info 421 
node1/lrm: starting service vm:200 +info 421 node1/lrm: service status vm:200 started +info 500 cmdlist: execute delay 100 +info 680 cmdlist: execute power node2 off +info 680 node2/crm: killed by poweroff +info 680 node2/lrm: killed by poweroff +info 682 node3/crm: node 'node2': state changed from 'online' => 'unknown' +info 742 node3/crm: service 'vm:100': state changed from 'started' to 'fence' +info 742 node3/crm: node 'node2': state changed from 'unknown' => 'fence' +emai 742 node3/crm: FENCE: Try to fence node 'node2' +info 780 cmdlist: execute delay 100 +info 802 node3/crm: got lock 'ha_agent_node2_lock' +info 802 node3/crm: fencing: acknowledged - got agent lock for node 'node2' +info 802 node3/crm: node 'node2': state changed from 'fence' => 'unknown' +emai 802 node3/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node2' +info 802 node3/crm: service 'vm:100': state changed from 'fence' to 'recovery' +err 802 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 822 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 842 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 862 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 882 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 902 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 922 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +err 942 node3/crm: recovering service 'vm:100' from fenced node 'node2' failed, no recovery node found +info 960 cmdlist: execute power node2 on +info 960 node2/crm: status change startup => wait_for_quorum +info 960 node2/lrm: status change startup => wait_for_agent_lock +info 962 node2/crm: status change wait_for_quorum => slave +info 963 node2/lrm: got lock 'ha_agent_node2_lock' +info 963 node2/lrm: status change wait_for_agent_lock => active +info 964 node3/crm: node 'node2': state changed from 'unknown' => 'online' +info 964 node3/crm: recover service 'vm:100' to previous failed and fenced node 'node2' again +info 964 node3/crm: service 'vm:100': state changed from 'recovery' to 'started' (node = node2) +info 983 node2/lrm: starting service vm:100 +info 983 node2/lrm: service status vm:100 started +info 1060 cmdlist: execute delay 100 +info 1240 cmdlist: execute power node3 off +info 1240 node3/crm: killed by poweroff +info 1240 node3/lrm: killed by poweroff +info 1340 cmdlist: execute delay 100 +info 1346 node5/crm: got lock 'ha_manager_lock' +info 1346 node5/crm: status change slave => master +info 1346 node5/crm: using scheduler mode 'static' +info 1346 node5/crm: node 'node3': state changed from 'online' => 'unknown' +info 1406 node5/crm: service 'vm:400': state changed from 'started' to 'fence' +info 1406 node5/crm: node 'node3': state changed from 'unknown' => 'fence' +emai 1406 node5/crm: FENCE: Try to fence node 'node3' +info 1406 node5/crm: got lock 'ha_agent_node3_lock' +info 1406 node5/crm: fencing: acknowledged - got agent lock for node 'node3' +info 1406 node5/crm: node 'node3': state changed from 'fence' => 'unknown' +emai 1406 node5/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node3' +info 1406 node5/crm: service 'vm:400': state changed from 'fence' to 'recovery' +err 1406 node5/crm: recovering service 'vm:400' from 
fenced node 'node3' failed, no recovery node found +err 1426 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found +err 1446 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found +err 1466 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found +err 1486 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found +err 1506 node5/crm: recovering service 'vm:400' from fenced node 'node3' failed, no recovery node found +info 1520 cmdlist: execute power node3 on +info 1520 node3/crm: status change startup => wait_for_quorum +info 1520 node3/lrm: status change startup => wait_for_agent_lock +info 1524 node3/crm: status change wait_for_quorum => slave +info 1528 node5/crm: node 'node3': state changed from 'unknown' => 'online' +info 1528 node5/crm: recover service 'vm:400' to previous failed and fenced node 'node3' again +info 1528 node5/crm: service 'vm:400': state changed from 'recovery' to 'started' (node = node3) +info 1545 node3/lrm: got lock 'ha_agent_node3_lock' +info 1545 node3/lrm: status change wait_for_agent_lock => active +info 1545 node3/lrm: starting service vm:400 +info 1545 node3/lrm: service status vm:400 started +info 1620 cmdlist: execute delay 100 +info 1800 cmdlist: execute power node4 off +info 1800 node4/crm: killed by poweroff +info 1800 node4/lrm: killed by poweroff +info 1806 node5/crm: node 'node4': state changed from 'online' => 'unknown' +info 1866 node5/crm: service 'vm:101': state changed from 'started' to 'fence' +info 1866 node5/crm: service 'vm:300': state changed from 'started' to 'fence' +info 1866 node5/crm: node 'node4': state changed from 'unknown' => 'fence' +emai 1866 node5/crm: FENCE: Try to fence node 'node4' +info 1900 cmdlist: execute delay 100 +info 1926 node5/crm: got lock 'ha_agent_node4_lock' +info 1926 node5/crm: fencing: acknowledged - got agent lock for node 'node4' +info 1926 node5/crm: node 'node4': state changed from 'fence' => 'unknown' +emai 1926 node5/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node4' +info 1926 node5/crm: service 'vm:101': state changed from 'fence' to 'recovery' +info 1926 node5/crm: service 'vm:300': state changed from 'fence' to 'recovery' +info 1926 node5/crm: recover service 'vm:101' from fenced node 'node4' to node 'node2' +info 1926 node5/crm: service 'vm:101': state changed from 'recovery' to 'started' (node = node2) +err 1926 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 1926 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +info 1943 node2/lrm: starting service vm:101 +info 1943 node2/lrm: service status vm:101 started +err 1946 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 1966 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 1986 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 2006 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 2026 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 2046 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found +err 2066 node5/crm: recovering service 'vm:300' from fenced node 'node4' failed, no recovery node found 
+info 2080 cmdlist: execute power node4 on +info 2080 node4/crm: status change startup => wait_for_quorum +info 2080 node4/lrm: status change startup => wait_for_agent_lock +info 2086 node4/crm: status change wait_for_quorum => slave +info 2087 node4/lrm: got lock 'ha_agent_node4_lock' +info 2087 node4/lrm: status change wait_for_agent_lock => active +info 2088 node5/crm: node 'node4': state changed from 'unknown' => 'online' +info 2088 node5/crm: recover service 'vm:300' to previous failed and fenced node 'node4' again +info 2088 node5/crm: service 'vm:300': state changed from 'recovery' to 'started' (node = node4) +info 2107 node4/lrm: starting service vm:300 +info 2107 node4/lrm: service status vm:300 started +info 2180 cmdlist: execute delay 100 +info 2360 cmdlist: execute power node5 off +info 2360 node5/crm: killed by poweroff +info 2360 node5/lrm: killed by poweroff +info 2460 cmdlist: execute delay 100 +info 2480 node1/crm: got lock 'ha_manager_lock' +info 2480 node1/crm: status change slave => master +info 2480 node1/crm: using scheduler mode 'static' +info 2480 node1/crm: node 'node5': state changed from 'online' => 'unknown' +info 2540 node1/crm: service 'vm:201': state changed from 'started' to 'fence' +info 2540 node1/crm: service 'vm:500': state changed from 'started' to 'fence' +info 2540 node1/crm: node 'node5': state changed from 'unknown' => 'fence' +emai 2540 node1/crm: FENCE: Try to fence node 'node5' +info 2540 node1/crm: got lock 'ha_agent_node5_lock' +info 2540 node1/crm: fencing: acknowledged - got agent lock for node 'node5' +info 2540 node1/crm: node 'node5': state changed from 'fence' => 'unknown' +emai 2540 node1/crm: SUCCEED: fencing: acknowledged - got agent lock for node 'node5' +info 2540 node1/crm: service 'vm:201': state changed from 'fence' to 'recovery' +info 2540 node1/crm: service 'vm:500': state changed from 'fence' to 'recovery' +info 2540 node1/crm: recover service 'vm:201' from fenced node 'node5' to node 'node2' +info 2540 node1/crm: service 'vm:201': state changed from 'recovery' to 'started' (node = node2) +err 2540 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found +err 2540 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found +info 2543 node2/lrm: starting service vm:201 +info 2543 node2/lrm: service status vm:201 started +err 2560 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found +err 2580 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found +err 2600 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found +err 2620 node1/crm: recovering service 'vm:500' from fenced node 'node5' failed, no recovery node found +info 2640 cmdlist: execute power node5 on +info 2640 node5/crm: status change startup => wait_for_quorum +info 2640 node5/lrm: status change startup => wait_for_agent_lock +info 2640 node1/crm: node 'node5': state changed from 'unknown' => 'online' +info 2640 node1/crm: recover service 'vm:500' to previous failed and fenced node 'node5' again +info 2640 node1/crm: service 'vm:500': state changed from 'recovery' to 'started' (node = node5) +info 2648 node5/crm: status change wait_for_quorum => slave +info 2669 node5/lrm: got lock 'ha_agent_node5_lock' +info 2669 node5/lrm: status change wait_for_agent_lock => active +info 2669 node5/lrm: starting service vm:500 +info 2669 node5/lrm: service status vm:500 started +info 3240 
hardware: exit simulation - done diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/manager_status b/src/test/test-crs-static-rebalance-resource-affinity3/manager_status new file mode 100644 index 0000000..9e26dfe --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/manager_status @@ -0,0 +1 @@ +{} \ No newline at end of file diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/rules_config b/src/test/test-crs-static-rebalance-resource-affinity3/rules_config new file mode 100644 index 0000000..442cd58 --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/rules_config @@ -0,0 +1,3 @@ +resource-affinity: keep-them-apart + resources vm:100,vm:200,vm:300,vm:400,vm:500 + affinity negative diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/service_config b/src/test/test-crs-static-rebalance-resource-affinity3/service_config new file mode 100644 index 0000000..86dc27d --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/service_config @@ -0,0 +1,9 @@ +{ + "vm:100": { "node": "node1", "state": "started" }, + "vm:101": { "node": "node1", "state": "started" }, + "vm:200": { "node": "node1", "state": "started" }, + "vm:201": { "node": "node1", "state": "started" }, + "vm:300": { "node": "node1", "state": "started" }, + "vm:400": { "node": "node1", "state": "started" }, + "vm:500": { "node": "node1", "state": "started" } +} diff --git a/src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats b/src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats new file mode 100644 index 0000000..755282b --- /dev/null +++ b/src/test/test-crs-static-rebalance-resource-affinity3/static_service_stats @@ -0,0 +1,9 @@ +{ + "vm:100": { "maxcpu": 16, "maxmem": 16000000000 }, + "vm:101": { "maxcpu": 4, "maxmem": 8000000000 }, + "vm:200": { "maxcpu": 2, "maxmem": 48000000000 }, + "vm:201": { "maxcpu": 4, "maxmem": 8000000000 }, + "vm:300": { "maxcpu": 8, "maxmem": 32000000000 }, + "vm:400": { "maxcpu": 32, "maxmem": 32000000000 }, + "vm:500": { "maxcpu": 16, "maxmem": 8000000000 } +} -- 2.39.5 _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel