RFC v1: https://lore.proxmox.com/pve-devel/20250325151254.193177-1-d.k...@proxmox.com/
RFC v2: https://lore.proxmox.com/pve-devel/20250620143148.218469-1-d.k...@proxmox.com/

I've separated the core HA Rules module and the transformation from HA
groups to HA Node Affinity rules (formerly known as HA Location rules)
in this patch series, to reduce the overhead for reviewers and to keep
a cleaner version history, as changing two things at a time is rather
confusing.

The main things that have changed since the last version (v2):

- split up the patch series (ofc)
- rebased on newest available master
- renamed "HA Location Rule" to "HA Node Affinity Rule"
- renamed any reference of a 'HA service' to 'HA resource' (e.g. rules
  property 'services' is now 'resources')
- converted tri-state property 'state' to a binary 'disable' flag on HA
  rules and exposed the 'contradictory' state with an 'errors' hash
- removed the "use-location-rules" feature flag and implemented a more
  straightforward ha groups migration (more on that below)
- removed any reference of ha groups from the web interface

As before, HA groups are migrated to HA node affinity rules in each HA
Manager round where something has changed about the HA groups / HA
resources config file, but this is now done unconditionally as soon as
an HA Manager runs with that version. It will also try to persistently
migrate these, but that will only succeed once all other nodes are
upgraded (i.e. every node can run at least the HA Manager version that
can successfully parse and apply the HA rules).
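
To make the in-memory migration described above a bit more concrete,
here is a minimal Python sketch (the actual implementation is Perl in
pve-ha-manager and may differ). It assumes that a group carries a
'nodes' string of name[:priority] entries and a 'restricted' flag, and
that 'restricted' maps to a 'strict' flag on the rule; that mapping,
the "ha-group-" rule ID prefix, the dataclass and the function names
are all made up for illustration and are not taken from the patches.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class NodeAffinityRule:
    """Hypothetical in-memory shape of a node affinity rule."""
    resources: Set[str]
    nodes: Dict[str, int]  # node name -> priority
    strict: bool = False


def parse_group_nodes(spec: str) -> Dict[str, int]:
    """Parse 'node1:2,node2' into {'node1': 2, 'node2': 0}."""
    nodes: Dict[str, int] = {}
    for entry in spec.split(","):
        name, _, prio = entry.strip().partition(":")
        nodes[name] = int(prio) if prio else 0
    return nodes


def migrate_groups(groups: dict, resources: dict) -> Dict[str, NodeAffinityRule]:
    """Build one rule per referenced group; nothing is written to disk."""
    rules: Dict[str, NodeAffinityRule] = {}
    for sid, res in resources.items():
        group_id = res.get("group")
        if group_id is None or group_id not in groups:
            continue  # resource is not bound to a known group
        rule_id = f"ha-group-{group_id}"  # assumed naming scheme
        if rule_id not in rules:
            group = groups[group_id]
            rules[rule_id] = NodeAffinityRule(
                resources=set(),
                nodes=parse_group_nodes(group["nodes"]),
                strict=bool(group.get("restricted", 0)),  # assumed mapping
            )
        rules[rule_id].resources.add(sid)
    return rules
```

Because the sketch only derives rules from the current groups and
resources, it can be re-run in every manager round without side
effects, which is roughly the property the in-memory migration relies
on.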

There are still some things left to do, which I didn't get around to
for this revision:

- Testing, testing, testing
- I ran out of time on the persistent HA groups migration part, which
  has at least the two TODOs mentioned in the patch itself, and I
  haven't tested it on any real PVE upgrade yet; it's more of a draft
  of how the migration could potentially work
- Also, the last patch for the persistent HA groups migration part will
  fail all tests except the two that have been added, because of the
  way the other tests are designed; that should be abstracted away in
  the HA environment, e.g., a routine "have_groups_been_migrated" for
  PVE2/Sim.
- There might be a few too many in-memory group migrations on the HA
  Rules API side now, but better safe than sorry, maybe they can be
  removed later; however, these shouldn't overwrite the rules that come
  from the config, I haven't checked on that yet
- Should the HA Groups API (and the HA Resources 'group' property in
  the HA Resources API) be removed now? Or should these stay, with any
  use of them triggering an auto-migration to the HA Rules?

As in the previous revisions, I've run a

    git rebase master --exec 'make clean && make deb'

on the series, so the tests should work for every patch.
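
One way to read the "have_groups_been_migrated" idea from the TODO list
above is as a hook on the HA environment that the manager queries
before it attempts the persistent migration. The following Python
sketch only illustrates that shape; the real code is Perl, and the
class names, the constructor argument and maybe_migrate_persistently
are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable


class HAEnv(ABC):
    """Hypothetical stand-in for the HA environment interface."""

    @abstractmethod
    def have_groups_been_migrated(self) -> bool:
        """Return True once the groups config no longer needs migrating."""


class SimEnv(HAEnv):
    """Simulator-style environment that answers from a fixed flag."""

    def __init__(self, migrated: bool) -> None:
        self._migrated = migrated

    def have_groups_been_migrated(self) -> bool:
        return self._migrated


def maybe_migrate_persistently(env: HAEnv, migrate: Callable[[], None]) -> bool:
    """Call `migrate` only if the environment reports unmigrated groups.

    Returns True if a migration was attempted, False otherwise.
    """
    if env.have_groups_been_migrated():
        return False
    migrate()
    return True
```

With such a hook, the existing tests could presumably run against an
environment that reports the groups as already migrated, so only the
two new tests would exercise the persistent path.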

cluster:

Daniel Kral (1):
  cfs: add 'ha/rules.cfg' to observed files

 src/PVE/Cluster.pm | 1 +
 src/pmxcfs/status.c | 1 +
 2 files changed, 2 insertions(+)

base-commit: 60e36c87b0fffe6dbdd5b1be72a9273b6f7cec2b
prerequisite-patch-id: 50b1021d35ecf86562d33dc6068c90e219557ab7
prerequisite-patch-id: 0374f409a039eebe9dd7587d6c018ef71ac2c67d
prerequisite-patch-id: d17849368da2aa61fcab9e08235f8673a2d0258e

ha-manager:

Daniel Kral (15):
  tree-wide: make arguments for select_service_node explicit
  manager: improve signature of select_service_node
  introduce rules base plugin
  rules: introduce node affinity rule plugin
  config, env, hw: add rules read and parse methods
  config: delete services from rules if services are deleted from config
  manager: read and update rules config
  test: ha tester: add test cases for future node affinity rules
  resources: introduce failback property in ha resource config
  manager: migrate ha groups to node affinity rules in-memory
  manager: apply node affinity rules when selecting service nodes
  test: add test cases for rules config
  api: introduce ha rules api endpoints
  cli: expose ha rules api endpoints to ha-manager cli
  manager: persistently migrate ha groups to ha rules

 .gitignore | 1 +
 debian/pve-ha-manager.install | 3 +
 src/PVE/API2/HA/Makefile | 2 +-
 src/PVE/API2/HA/Resources.pm | 9 +
 src/PVE/API2/HA/Rules.pm | 391 +++++++++++++++
 src/PVE/API2/HA/Status.pm | 11 +-
 src/PVE/CLI/ha_manager.pm | 32 ++
 src/PVE/HA/Config.pm | 58 ++-
 src/PVE/HA/Env.pm | 30 ++
 src/PVE/HA/Env/PVE2.pm | 40 ++
 src/PVE/HA/Groups.pm | 48 ++
 src/PVE/HA/Makefile | 3 +-
 src/PVE/HA/Manager.pm | 259 ++++++----
 src/PVE/HA/Resources.pm | 9 +
 src/PVE/HA/Resources/PVECT.pm | 1 +
 src/PVE/HA/Resources/PVEVM.pm | 1 +
 src/PVE/HA/Rules.pm | 455 ++++++++++++++++++
 src/PVE/HA/Rules/Makefile | 6 +
 src/PVE/HA/Rules/NodeAffinity.pm | 296 ++++++++++++
 src/PVE/HA/Sim/Env.pm | 44 ++
 src/PVE/HA/Sim/Hardware.pm | 44 ++
 src/PVE/HA/Tools.pm | 46 ++
 src/test/Makefile | 4 +-
 .../defaults-for-node-affinity-rules.cfg | 22 +
 ...efaults-for-node-affinity-rules.cfg.expect | 60 +++
 ...e-resource-refs-in-node-affinity-rules.cfg | 31 ++
 ...rce-refs-in-node-affinity-rules.cfg.expect | 63 +++
 src/test/test-group-migrate1/README | 10 +
 src/test/test-group-migrate1/cmdlist | 3 +
 src/test/test-group-migrate1/groups | 7 +
 src/test/test-group-migrate1/hardware_status | 5 +
 src/test/test-group-migrate1/log.expect | 306 ++++++++++++
 src/test/test-group-migrate1/manager_status | 1 +
 src/test/test-group-migrate1/service_config | 5 +
 src/test/test-group-migrate2/README | 10 +
 src/test/test-group-migrate2/cmdlist | 3 +
 src/test/test-group-migrate2/groups | 7 +
 src/test/test-group-migrate2/hardware_status | 5 +
 src/test/test-group-migrate2/log.expect | 47 ++
 src/test/test-group-migrate2/manager_status | 1 +
 src/test/test-group-migrate2/service_config | 5 +
 src/test/test-node-affinity-nonstrict1/README | 10 +
 .../test-node-affinity-nonstrict1/cmdlist | 4 +
 src/test/test-node-affinity-nonstrict1/groups | 2 +
 .../hardware_status | 5 +
 .../test-node-affinity-nonstrict1/log.expect | 40 ++
 .../manager_status | 1 +
 .../service_config | 3 +
 src/test/test-node-affinity-nonstrict2/README | 12 +
 .../test-node-affinity-nonstrict2/cmdlist | 4 +
 src/test/test-node-affinity-nonstrict2/groups | 3 +
 .../hardware_status | 5 +
 .../test-node-affinity-nonstrict2/log.expect | 35 ++
 .../manager_status | 1 +
 .../service_config | 3 +
 src/test/test-node-affinity-nonstrict3/README | 10 +
 .../test-node-affinity-nonstrict3/cmdlist | 4 +
 src/test/test-node-affinity-nonstrict3/groups | 2 +
 .../hardware_status | 5 +
 .../test-node-affinity-nonstrict3/log.expect | 56 +++
 .../manager_status | 1 +
 .../service_config | 5 +
 src/test/test-node-affinity-nonstrict4/README | 14 +
 .../test-node-affinity-nonstrict4/cmdlist | 4 +
 src/test/test-node-affinity-nonstrict4/groups | 2 +
 .../hardware_status | 5 +
 .../test-node-affinity-nonstrict4/log.expect | 54 +++
 .../manager_status | 1 +
 .../service_config | 5 +
 src/test/test-node-affinity-nonstrict5/README | 16 +
 .../test-node-affinity-nonstrict5/cmdlist | 5 +
 src/test/test-node-affinity-nonstrict5/groups | 2 +
 .../hardware_status | 5 +
 .../test-node-affinity-nonstrict5/log.expect | 66 +++
 .../manager_status | 1 +
 .../service_config | 3 +
 src/test/test-node-affinity-nonstrict6/README | 14 +
 .../test-node-affinity-nonstrict6/cmdlist | 5 +
 src/test/test-node-affinity-nonstrict6/groups | 3 +
 .../hardware_status | 5 +
 .../test-node-affinity-nonstrict6/log.expect | 52 ++
 .../manager_status | 1 +
 .../service_config | 3 +
 src/test/test-node-affinity-strict1/README | 10 +
 src/test/test-node-affinity-strict1/cmdlist | 4 +
 src/test/test-node-affinity-strict1/groups | 3 +
 .../hardware_status | 5 +
 .../test-node-affinity-strict1/log.expect | 40 ++
 .../test-node-affinity-strict1/manager_status | 1 +
 .../test-node-affinity-strict1/service_config | 3 +
 src/test/test-node-affinity-strict2/README | 11 +
 src/test/test-node-affinity-strict2/cmdlist | 4 +
 src/test/test-node-affinity-strict2/groups | 4 +
 .../hardware_status | 5 +
 .../test-node-affinity-strict2/log.expect | 40 ++
 .../test-node-affinity-strict2/manager_status | 1 +
 .../test-node-affinity-strict2/service_config | 3 +
 src/test/test-node-affinity-strict3/README | 10 +
 src/test/test-node-affinity-strict3/cmdlist | 4 +
 src/test/test-node-affinity-strict3/groups | 3 +
 .../hardware_status | 5 +
 .../test-node-affinity-strict3/log.expect | 74 +++
 .../test-node-affinity-strict3/manager_status | 1 +
 .../test-node-affinity-strict3/service_config | 5 +
 src/test/test-node-affinity-strict4/README | 14 +
 src/test/test-node-affinity-strict4/cmdlist | 4 +
 src/test/test-node-affinity-strict4/groups | 3 +
 .../hardware_status | 5 +
 .../test-node-affinity-strict4/log.expect | 54 +++
 .../test-node-affinity-strict4/manager_status | 1 +
 .../test-node-affinity-strict4/service_config | 5 +
 src/test/test-node-affinity-strict5/README | 16 +
 src/test/test-node-affinity-strict5/cmdlist | 5 +
 src/test/test-node-affinity-strict5/groups | 3 +
 .../hardware_status | 5 +
 .../test-node-affinity-strict5/log.expect | 66 +++
 .../test-node-affinity-strict5/manager_status | 1 +
 .../test-node-affinity-strict5/service_config | 3 +
 src/test/test-node-affinity-strict6/README | 14 +
 src/test/test-node-affinity-strict6/cmdlist | 5 +
 src/test/test-node-affinity-strict6/groups | 4 +
 .../hardware_status | 5 +
 .../test-node-affinity-strict6/log.expect | 52 ++
 .../test-node-affinity-strict6/manager_status | 1 +
 .../test-node-affinity-strict6/service_config | 3 +
 src/test/test_failover1.pl | 27 +-
 src/test/test_rules_config.pl | 100 ++++
 127 files changed, 3398 insertions(+), 95 deletions(-)
 create mode 100644 src/PVE/API2/HA/Rules.pm
 create mode 100644 src/PVE/HA/Rules.pm
 create mode 100644 src/PVE/HA/Rules/Makefile
 create mode 100644 src/PVE/HA/Rules/NodeAffinity.pm
 create mode 100644 src/test/rules_cfgs/defaults-for-node-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/defaults-for-node-affinity-rules.cfg.expect
 create mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-node-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-node-affinity-rules.cfg.expect
 create mode 100644 src/test/test-group-migrate1/README
 create mode 100644 src/test/test-group-migrate1/cmdlist
 create mode 100644 src/test/test-group-migrate1/groups
 create mode 100644 src/test/test-group-migrate1/hardware_status
 create mode 100644 src/test/test-group-migrate1/log.expect
 create mode 100644 src/test/test-group-migrate1/manager_status
 create mode 100644 src/test/test-group-migrate1/service_config
 create mode 100644 src/test/test-group-migrate2/README
 create mode 100644 src/test/test-group-migrate2/cmdlist
 create mode 100644 src/test/test-group-migrate2/groups
 create mode 100644 src/test/test-group-migrate2/hardware_status
 create mode 100644 src/test/test-group-migrate2/log.expect
 create mode 100644 src/test/test-group-migrate2/manager_status
 create mode 100644 src/test/test-group-migrate2/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict1/README
 create mode 100644 src/test/test-node-affinity-nonstrict1/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict1/groups
 create mode 100644 src/test/test-node-affinity-nonstrict1/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict1/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict1/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict1/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict2/README
 create mode 100644 src/test/test-node-affinity-nonstrict2/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict2/groups
 create mode 100644 src/test/test-node-affinity-nonstrict2/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict2/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict2/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict2/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict3/README
 create mode 100644 src/test/test-node-affinity-nonstrict3/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict3/groups
 create mode 100644 src/test/test-node-affinity-nonstrict3/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict3/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict3/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict3/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict4/README
 create mode 100644 src/test/test-node-affinity-nonstrict4/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict4/groups
 create mode 100644 src/test/test-node-affinity-nonstrict4/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict4/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict4/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict4/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict5/README
 create mode 100644 src/test/test-node-affinity-nonstrict5/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict5/groups
 create mode 100644 src/test/test-node-affinity-nonstrict5/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict5/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict5/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict5/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict6/README
 create mode 100644 src/test/test-node-affinity-nonstrict6/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict6/groups
 create mode 100644 src/test/test-node-affinity-nonstrict6/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict6/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict6/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict6/service_config
 create mode 100644 src/test/test-node-affinity-strict1/README
 create mode 100644 src/test/test-node-affinity-strict1/cmdlist
 create mode 100644 src/test/test-node-affinity-strict1/groups
 create mode 100644 src/test/test-node-affinity-strict1/hardware_status
 create mode 100644 src/test/test-node-affinity-strict1/log.expect
 create mode 100644 src/test/test-node-affinity-strict1/manager_status
 create mode 100644 src/test/test-node-affinity-strict1/service_config
 create mode 100644 src/test/test-node-affinity-strict2/README
 create mode 100644 src/test/test-node-affinity-strict2/cmdlist
 create mode 100644 src/test/test-node-affinity-strict2/groups
 create mode 100644 src/test/test-node-affinity-strict2/hardware_status
 create mode 100644 src/test/test-node-affinity-strict2/log.expect
 create mode 100644 src/test/test-node-affinity-strict2/manager_status
 create mode 100644 src/test/test-node-affinity-strict2/service_config
 create mode 100644 src/test/test-node-affinity-strict3/README
 create mode 100644 src/test/test-node-affinity-strict3/cmdlist
 create mode 100644 src/test/test-node-affinity-strict3/groups
 create mode 100644 src/test/test-node-affinity-strict3/hardware_status
 create mode 100644 src/test/test-node-affinity-strict3/log.expect
 create mode 100644 src/test/test-node-affinity-strict3/manager_status
 create mode 100644 src/test/test-node-affinity-strict3/service_config
 create mode 100644 src/test/test-node-affinity-strict4/README
 create mode 100644 src/test/test-node-affinity-strict4/cmdlist
 create mode 100644 src/test/test-node-affinity-strict4/groups
 create mode 100644 src/test/test-node-affinity-strict4/hardware_status
 create mode 100644 src/test/test-node-affinity-strict4/log.expect
 create mode 100644 src/test/test-node-affinity-strict4/manager_status
 create mode 100644 src/test/test-node-affinity-strict4/service_config
 create mode 100644 src/test/test-node-affinity-strict5/README
 create mode 100644 src/test/test-node-affinity-strict5/cmdlist
 create mode 100644 src/test/test-node-affinity-strict5/groups
 create mode 100644 src/test/test-node-affinity-strict5/hardware_status
 create mode 100644 src/test/test-node-affinity-strict5/log.expect
 create mode 100644 src/test/test-node-affinity-strict5/manager_status
 create mode 100644 src/test/test-node-affinity-strict5/service_config
 create mode 100644 src/test/test-node-affinity-strict6/README
 create mode 100644 src/test/test-node-affinity-strict6/cmdlist
 create mode 100644 src/test/test-node-affinity-strict6/groups
 create mode 100644 src/test/test-node-affinity-strict6/hardware_status
 create mode 100644 src/test/test-node-affinity-strict6/log.expect
 create mode 100644 src/test/test-node-affinity-strict6/manager_status
 create mode 100644 src/test/test-node-affinity-strict6/service_config
 create mode 100755 src/test/test_rules_config.pl

base-commit: 264dc2c58d145394219f82f25d41f4fc438c4dc4
prerequisite-patch-id: 530b875c25a6bded1cc2294960cf465d5c2bcbca

docs:

Daniel Kral (1):
  ha: add documentation about ha rules and ha node affinity rules

 Makefile | 2 +
 gen-ha-rules-node-affinity-opts.pl | 20 ++++++
 gen-ha-rules-opts.pl | 17 +++++
 ha-manager.adoc | 103 +++++++++++++++++++++++++++++
 ha-rules-node-affinity-opts.adoc | 18 +++++
 ha-rules-opts.adoc | 12 ++++
 pmxcfs.adoc | 1 +
 7 files changed, 173 insertions(+)
 create mode 100755 gen-ha-rules-node-affinity-opts.pl
 create mode 100755 gen-ha-rules-opts.pl
 create mode 100644 ha-rules-node-affinity-opts.adoc
 create mode 100644 ha-rules-opts.adoc

base-commit: 7cc17ee5950a53bbd5b5ad81270352ccdb1c541c
prerequisite-patch-id: 92556cd6c1edfb88b397ae244d7dcd56876cd8fb

manager:

Daniel Kral (3):
  api: ha: add ha rules api endpoints
  ui: ha: remove ha groups from ha resource components
  ui: ha: show failback flag in resources status view

 PVE/API2/HAConfig.pm | 8 +++++++-
 www/manager6/ha/ResourceEdit.js | 16 ++++++++++++----
 www/manager6/ha/Resources.js | 17 +++--------------
 www/manager6/ha/StatusView.js | 5 ++++-
 4 files changed, 26 insertions(+), 20 deletions(-)

base-commit: c0cbe76ee90e7110934c50414bc22371cf13c01a
prerequisite-patch-id: ec6a39936719cfe38787fccb1a80af6378980723

Summary over all repositories:
  140 files changed, 3599 insertions(+), 115 deletions(-)

--
Generated by git-murpp 0.8.0