Add documentation about HA Resource Affinity rules, how they affect the CRS scheduler, and what users can expect when the rules are changed.
There are also a few points on the rule conflicts/errors list which describe some conflicts that can arise from a mixed usage of HA Node Affinity rules and HA Resource Affinity rules.

Signed-off-by: Daniel Kral <d.k...@proxmox.com>
---
 Makefile                               |   1 +
 gen-ha-rules-resource-affinity-opts.pl |  20 ++++
 ha-manager.adoc                        | 133 +++++++++++++++++++++++++
 ha-rules-resource-affinity-opts.adoc   |   8 ++
 4 files changed, 162 insertions(+)
 create mode 100755 gen-ha-rules-resource-affinity-opts.pl
 create mode 100644 ha-rules-resource-affinity-opts.adoc

diff --git a/Makefile b/Makefile
index c5e506e..4d9e2f0 100644
--- a/Makefile
+++ b/Makefile
@@ -51,6 +51,7 @@ GEN_SCRIPTS= \
 	gen-ha-resources-opts.pl \
 	gen-ha-rules-node-affinity-opts.pl \
 	gen-ha-rules-opts.pl \
+	gen-ha-rules-resource-affinity-opts.pl \
 	gen-datacenter.cfg.5-opts.pl \
 	gen-pct.conf.5-opts.pl \
 	gen-pct-network-opts.pl \
diff --git a/gen-ha-rules-resource-affinity-opts.pl b/gen-ha-rules-resource-affinity-opts.pl
new file mode 100755
index 0000000..5abed50
--- /dev/null
+++ b/gen-ha-rules-resource-affinity-opts.pl
@@ -0,0 +1,20 @@
+#!/usr/bin/perl
+
+use lib '.';
+use strict;
+use warnings;
+use PVE::RESTHandler;
+
+use Data::Dumper;
+
+use PVE::HA::Rules;
+use PVE::HA::Rules::ResourceAffinity;
+
+my $private = PVE::HA::Rules::private();
+my $resource_affinity_rule_props = PVE::HA::Rules::ResourceAffinity::properties();
+my $properties = {
+    resources => $private->{propertyList}->{resources},
+    $resource_affinity_rule_props->%*,
+};
+
+print PVE::RESTHandler::dump_properties($properties);
diff --git a/ha-manager.adoc b/ha-manager.adoc
index ec26c22..8d06885 100644
--- a/ha-manager.adoc
+++ b/ha-manager.adoc
@@ -692,6 +692,10 @@ include::ha-rules-opts.adoc[]
 | HA Rule Type | Description
 | `node-affinity` | Places affinity from one or more HA resources to one or
 more nodes.
+| `resource-affinity` | Places affinity between two or more HA resources. The
+affinity `negative` specifies that HA resources are to be kept on separate
+nodes, while the affinity `positive` specifies that HA resources are to be kept
+on the same node.
 |===========================================================
 
 [[ha_manager_node_affinity_rules]]
@@ -758,6 +762,88 @@ Node Affinity Rule Properties
 
 include::ha-rules-node-affinity-opts.adoc[]
 
+[[ha_manager_resource_affinity_rules]]
+Resource Affinity Rules
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Another common requirement is that two or more HA resources should run on
+either the same node, or should be distributed on separate nodes. These are
+also commonly called "Affinity/Anti-Affinity constraints".
+
+For example, suppose there is a lot of communication traffic between the HA
+resources `vm:100` and `vm:200`, e.g., a web server communicating with a
+database server. If those HA resources are on separate nodes, this could
+potentially result in a higher latency and unnecessary network load. Resource
+affinity rules with the affinity `positive` implement the constraint to keep
+the HA resources on the same node:
+
+----
+# ha-manager rules add resource-affinity keep-together \
+    --affinity positive --resources vm:100,vm:200
+----
+
+NOTE: If there are two or more positive resource affinity rules, which have
+common HA resources, then these are treated as a single positive resource
+affinity rule. For example, if the HA resources `vm:100` and `vm:101` and the
+HA resources `vm:101` and `vm:102` are each in a positive resource affinity
+rule, then it is the same as if `vm:100`, `vm:101` and `vm:102` would have been
+in a single positive resource affinity rule.
+
+However, suppose there are computationally expensive, and/or distributed
+programs running on the HA resources `vm:200` and `ct:300`, e.g., sharded
+database instances. In that case, running them on the same node could
+potentially result in pressure on the hardware resources of the node and will
+slow down the operations of these HA resources. Resource affinity rules with
+the affinity `negative` implement the constraint to spread the HA resources on
+separate nodes:
+
+----
+# ha-manager rules add resource-affinity keep-separate \
+    --affinity negative --resources vm:200,ct:300
+----
+
+Unlike node affinity rules, resource affinity rules are strict by default,
+i.e., if the constraints imposed by the resource affinity rules cannot be met
+for a HA resource, the HA Manager will put the HA resource in recovery state in
+case of a failover or in error state elsewhere.
+
+The above commands created the following rules in the rules configuration file:
+
+.Resource Affinity Rules Configuration Example (`/etc/pve/ha/rules.cfg`)
+----
+resource-affinity: keep-together
+  resources vm:100,vm:200
+  affinity positive
+
+resource-affinity: keep-separate
+  resources vm:200,ct:300
+  affinity negative
+----
+
+Interactions between Positive and Negative Resource Affinity Rules
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+If there are HA resources in a positive resource affinity rule, which are also
+part of a negative resource affinity rule, then all the other HA resources in
+the positive resource affinity rule are in negative affinity with the HA
+resources of these negative resource affinity rules as well.
+
+For example, if the HA resources `vm:100`, `vm:101`, and `vm:102` are in a
+positive resource affinity rule, and `vm:100` is in a negative resource affinity
+rule with the HA resource `ct:200`, then `vm:101` and `vm:102` are each in
+negative resource affinity with `ct:200` as well.
+
+Note that if there are two or more HA resources in both a positive and negative
+resource affinity rule, then those will be disabled as they cause a conflict:
+Two or more HA resources cannot be kept on the same node and separated on
+different nodes at the same time. For more information on these cases, see the
+section about xref:ha_manager_rule_conflicts[rule conflicts and errors] below.
+
+Resource Affinity Rule Properties
++++++++++++++++++++++++++++++++++
+
+include::ha-rules-resource-affinity-opts.adoc[]
+
 [[ha_manager_rule_conflicts]]
 Rule Conflicts and Errors
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -774,6 +860,43 @@ Currently, HA rules are checked for the following feasibility tests:
   total. If two or more HA node affinity rules specify the same HA resource,
   these HA node affinity rules will be disabled.
 
+* A HA resource affinity rule must specify at least two HA resources to be
+  feasible. If a HA resource affinity rule does specify only one HA resource,
+  the HA resource affinity rule will be disabled.
+
+* A HA resource affinity rule must specify no more HA resources than there are
+  nodes in the cluster. If a HA resource affinity rule does specify more HA
+  resources than there are nodes in the cluster, the HA resource affinity rule
+  will be disabled.
+
+* A positive HA resource affinity rule cannot specify the same two or more HA
+  resources as a negative HA resource affinity rule. That is, two or more HA
+  resources cannot be kept together and separate at the same time. If any pair
+  of positive and negative HA resource affinity rules do specify the same two or
+  more HA resources, both HA resource affinity rules will be disabled.
+
+* A HA resource, which is already constrained by a HA node affinity rule, can
+  only be referenced by a HA resource affinity rule, if the HA node affinity
+  rule does only use a single priority group. That is, the specified nodes in
+  the HA node affinity rule have the same priority. If one of the HA resources
+  in a HA resource affinity rule is constrained by a HA node affinity rule with
+  multiple priority groups, the HA resource affinity rule will be disabled.
+
+* The HA resources of a positive HA resource affinity rule, which are
+  constrained by HA node affinity rules, must have at least one common node
+  on which the HA resources are allowed to run. Otherwise, the HA resources
+  could only run on separate nodes. In other words, if two or more HA resources
+  of a positive HA resource affinity rule are constrained to different nodes,
+  the positive HA resource affinity rule will be disabled.
+
+* The HA resources of a negative HA resource affinity rule, which are
+  constrained by HA node affinity rules, must have at least enough nodes to
+  separate these constrained HA resources on. Otherwise, the HA resources do not
+  have enough nodes to be separated on. In other words, if two or more HA
+  resources of a negative HA resource affinity rule are constrained to fewer
+  nodes than needed to separate them on, the negative HA resource affinity rule
+  will be disabled.
+
 [[ha_manager_fencing]]
 Fencing
 -------
@@ -1205,6 +1328,16 @@ The CRS is currently used at the following scheduling points:
    algorithm to ensure that these HA resources are assigned according to their
    node and priority constraints.
 
+** Positive resource affinity rules: If a positive resource affinity rule is
+   created or HA resources are added to an existing positive resource affinity
+   rule, the HA stack will use the CRS algorithm to ensure that these HA
+   resources are moved to a common node.
+
+** Negative resource affinity rules: If a negative resource affinity rule is
+   created or HA resources are added to an existing negative resource affinity
+   rule, the HA stack will use the CRS algorithm to ensure that these HA
+   resources are moved to separate nodes.
+
 - HA service stopped -> start transition (opt-in). Requesting that a stopped
   service should be started is an good opportunity to check for the best suited
   node as per the CRS algorithm, as moving stopped services is cheaper to do
diff --git a/ha-rules-resource-affinity-opts.adoc b/ha-rules-resource-affinity-opts.adoc
new file mode 100644
index 0000000..596ec3c
--- /dev/null
+++ b/ha-rules-resource-affinity-opts.adoc
@@ -0,0 +1,8 @@
+`affinity`: `<negative | positive>` ::
+
+Describes whether the HA resources are supposed to be kept on the same node ('positive'), or are supposed to be kept on separate nodes ('negative').
+
+`resources`: `<type>:<name>{,<type>:<name>}*` ::
+
+List of HA resource IDs. This consists of a list of resource types followed by a resource specific name separated with a colon (example: vm:100,ct:101).
+
-- 
2.39.5