Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules

Fiona Ebner Thu, 24 Apr 2025 03:19:32 -0700

Am 25.03.25 um 16:12 schrieb Daniel Kral:
> | Canonicalization
> ----------
> 
> Additionally, colocation rules are currently simplified as follows:
> 
> - If there are multiple positive colocation rules with common services
>   and the same strictness, these are merged to a single positive
>   colocation rule.


Do you intend to do that when writing the configuration file? I think
rules are better left unmerged from a user perspective. For example:

- services 1, 2 and 3 should strictly stay together, because of reason A
- services 1 and 3 should strictly stay together, because of a different
  reason B

Another scenario might be that the user is currently in the process of
editing some rules one-by-one and then it might also be surprising if
something is auto-merged.

You can of course always dynamically merge them when doing the
computation for the node selection.
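
To illustrate, here is a minimal sketch of merging at computation time only, so that the rules the user wrote stay untouched in the configuration. The rule layout (a list of `(services, strict)` tuples) and the function name are assumptions made for this example, not the data structures used in the patch series.

```python
from collections import defaultdict


def merge_positive_rules(rules):
    """Merge positive colocation rules sharing services and strictness.

    `rules` is a list of (services, strict) tuples (hypothetical layout).
    Returns a new list; the configured rules are not modified.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Key every service by strictness, so that strict and non-strict
    # rules are never merged with each other.
    for services, strict in rules:
        members = [(strict, s) for s in sorted(services)]
        for member in members:
            union(members[0], member)

    # Collect all services that ended up in the same connected component.
    groups = defaultdict(set)
    for key in list(parent):
        groups[find(key)].add(key)

    return sorted(
        (sorted(s for _, s in members), strict)
        for (strict, _), members in groups.items()
    )


configured = [
    ({"vm:1", "vm:2", "vm:3"}, True),  # reason A
    ({"vm:1", "vm:3"}, True),          # reason B
    ({"vm:3", "vm:4"}, False),
]
merged = merge_positive_rules(configured)
```

With the two strict rules from the example above, `merged` contains one strict rule for `vm:1`, `vm:2` and `vm:3`, while `configured` still holds both rules with their separate reasons.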

In the same spirit, a comment field for each rule where the user can put
the reason might be nice to have.

Another question is if we should allow enabling/disabling rules.

Comment and enabling can of course always be added later. I'm just not
sure we should start out with the auto-merging of rules.

> | Inference rules
> ----------
> 
> There are currently no inference rules implemented for the RFC, but
> there could be potential to further simplify some code paths in the
> future, e.g. a positive colocation rule where one service is part of a
> restricted HA group makes the other services in the positive colocation
> rule a part of this HA group as well.

Only if the rule is strict. If we do this, I think it should also only
happen dynamically for the node selection.
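
A rough sketch of how such an inference could be evaluated dynamically, without writing anything back to the configuration. All names and the data layout are hypothetical, and the strict positive rules are assumed to be already merged, so that each service appears in at most one of them.

```python
def effective_allowed_nodes(service, strict_positive_rules, restricted_groups):
    """Return the nodes a service may run on, or None if unrestricted.

    strict_positive_rules: list of sets of services (hypothetical layout)
    restricted_groups: dict mapping a service to the node set of its
                       restricted HA group (hypothetical layout)
    """
    # The service itself plus everything it must strictly stay together with.
    partners = {service}
    for rule in strict_positive_rules:
        if service in rule:
            partners |= rule

    # Intersect the node sets of all restricted groups among the partners.
    allowed = None
    for partner in sorted(partners):
        nodes = restricted_groups.get(partner)
        if nodes is not None:
            allowed = set(nodes) if allowed is None else allowed & nodes
    return allowed


rules = [{"vm:1", "vm:2"}]
groups = {"vm:1": {"node1", "node2"}}
inferred = effective_allowed_nodes("vm:2", rules, groups)
```

Here `vm:2` is not in any HA group itself, but inherits the restriction to `node1` and `node2` from `vm:1` at selection time only.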


> Comment about HA groups -> Location Rules
> -----------------------------------------
> 
> This part is not really part of the patch series, but is still worth an
> on-list discussion.
> 
> I'd like to suggest to also transform the existing HA groups to location
> rules, if the rule concept turns out to be a good fit for the colocation
> feature in the HA Manager, as HA groups seem to integrate quite easily
> into this concept.
> 
> This would make service-node relationships a little more flexible for
> users and we'd be able to have both configurable / visible in the same
> WebUI view, API endpoint, and configuration file. Also, some code paths
> could be a little more concise, e.g. checking changes to constraints and
> canonicalizing the rules config.
> 
> The how should be rather straightforward for the obvious use cases:
> 
> - Services in unrestricted HA groups -> Location rules with the nodes of
>   the HA group; We could either split each node priority group into
>   separate location rules (with each having their score / weight) or
>   keep the input format of HA groups with a list of
>   `<node>(:<priority>)` in each rule
> 
> - Services in restricted HA groups -> Same as above, but also using
>   either `+inf` for a mandatory location rule or `strict` property
>   depending on how we decide on the colocation rule properties

I'd prefer having a 'strict' property, as that is orthogonal to the
priorities and that aligns it with what you propose for the colocation
rules.

> This would allow most of the use cases of HA groups to be easily
> migratable to location rules. We could also keep the inference of the
> 'default group' for unrestricted HA groups (any node that is available
> is added as a group member with priority -1).

Nodes can change, so adding them explicitly will mean it can get
outdated. This should be implicit/done dynamically.
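
A small sketch of what "implicit" could mean here: the configured rule only lists the nodes the user named, and the priority -1 fallback members are derived from the currently online nodes each time a node is selected. The function name and arguments are assumptions for illustration.

```python
def effective_priorities(configured, online_nodes, restricted):
    """Compute node priorities for one selection round.

    configured: dict node -> priority, as written by the user
    online_nodes: set of nodes that are currently available
    restricted: whether the rule/group is restricted (strict)
    """
    # Only nodes that are online right now can be candidates.
    prios = {n: p for n, p in configured.items() if n in online_nodes}
    if not restricted:
        # Implicit 'default group': every other online node gets priority -1.
        for node in online_nodes:
            prios.setdefault(node, -1)
    return prios


configured = {"node1": 2, "node2": 1}
unrestricted = effective_priorities(configured, {"node2", "node3"}, False)
restricted = effective_priorities(configured, {"node2", "node3"}, True)
```

Since the result is recomputed from the online node set, a node added to or removed from the cluster never leaves stale entries in the rules config.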

> The only thing I'm unsure about is how we would migrate the
> `nofailback` option, since this operates on the group-level. If we keep
> the `<node>(:<priority>)` syntax and restrict that each service can only
> be part of one location rule, it'd be easy to have the same flag. If we
> go with multiple location rules per service and each having a score or
> weight (for the priority), then we wouldn't be able to have this flag
> anymore. I think we could keep the semantic if we move this flag to the
> service config, but I'm thankful for any comments on this.

My gut feeling is that going for a more direct mapping, i.e. each
location rule represents one HA group, is better. The nofailback flag
can still apply to a given location rule I think? For a given service,
if a higher-priority node is online for any location rule the service is
part of, with nofailback=0, it will get migrated to that higher-priority
node. In that case, it does make sense for a given service to be part of
only one location rule, since node priorities can conflict between rules.
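
To make the intended semantics concrete, here is a simplified sketch of node selection for a single location rule with a `nofailback` flag. It assumes that a larger number means a higher priority and that each service is part of exactly one location rule; it is not the HA manager's actual selection logic.

```python
def select_node(priorities, online_nodes, current_node, nofailback):
    """Pick a node for a service under one location rule.

    priorities: dict node -> priority (larger means preferred, an assumption)
    Returns None if no node of the rule is online.
    """
    candidates = {n: p for n, p in priorities.items() if n in online_nodes}
    if not candidates:
        return None
    # With nofailback, stay put as long as the current node is still usable.
    if nofailback and current_node in candidates:
        return current_node
    best = max(candidates.values())
    # Avoid pointless moves between nodes of equal priority.
    if candidates.get(current_node) == best:
        return current_node
    # Deterministic tie-break by node name.
    return min(n for n, p in candidates.items() if p == best)


prios = {"node1": 2, "node2": 1}
both_online = {"node1", "node2"}
with_failback = select_node(prios, both_online, "node2", nofailback=False)
without_failback = select_node(prios, both_online, "node2", nofailback=True)
```

With `nofailback=False` the service on `node2` is moved to the higher-priority `node1` once it is online again; with `nofailback=True` it stays on `node2`.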


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

Re: [pve-devel] [RFC cluster/ha-manager 00/16] HA colocation rules
