Am 30.04.25 um 13:09 schrieb Daniel Kral: > On 3/25/25 16:12, Daniel Kral wrote: >> sub select_service_node { >> - my ($groups, $online_node_usage, $sid, $service_conf, >> $current_node, $try_next, $tried_nodes, $maintenance_fallback, >> $best_scored) = @_; >> + # TODO Cleanup this signature post-RFC >> + my ($rules, $groups, $online_node_usage, $sid, $service_conf, >> $current_node, $try_next, $tried_nodes, $maintenance_fallback, >> $best_scored) = @_; > > I'm currently trying to clean up the helper's signature here, but doing > something like > > sub select_service_node { > my ($service_info, $affinity_info, $try_next, $best_scored) = @_; > > my ($sid, $service_conf, $current_node) = $service_info->@{qw(sid > config current_node)}; > my ($rules, $groups, $online_node_usage, $tried_nodes, > $maintenance_fallback) = > $affinity_info->@{qw(rules groups online_node_usage failed_nodes > maintenance_node)}; > > would require us to create helper structures on all four call sites (one > of them is just the test case ./test_failover1.pl), or introduce another > helper to just create them for passing it here and immediately de- > structuring it in select_service_node(...): > > sub get_service_affinity_info { > my ($self, $sid, $cd, $sd) = @_; > > my $service_info = { > sid => $sid, > config => $cd, > current_node => $sd->{node}, > }; > > my $affinity_info = { > rules => $self->{rules}, > groups => $self->{groups}, > failed_nodes => $sd->{failed_nodes}, > maintenance_node => $sd->{maintenance_node}, > online_node_usage => $self->{online_node_usage}, > }; > > return ($service_info, $affinity_info); > }; > > Also the call site in next_state_recovery(...) does not pass $sd- >>{failed_nodes}, $sd->{maintenance_node} and $best_scored to it. AFAICS > $sd->{failed_nodes} should be undef in next_state_recovery(...) anyway, > but I feel like I have missed some states it could be in there. And $sd- >>{maintenance_node} could be set anytime.
I think it makes sense to have it explicitly (rather than just implicitly) opt-out of $try_next just like the caller for rebalance_on_request_start. Without $try_next, the $tried_nodes parameter does not have any effect (the caller for rebalance_on_request_start passes it, but select_service_node() won't read it if $try_next isn't set). The caller in next_state_recovery() should also pass $best_scored IMHO, so that it is fully aligned with the caller for rebalance_on_request_start. It won't be an actual change result-wise, because for recovery, the current node is not available, so it already cannot be the result, but it makes sense semantically. We want the best scored node for recovery. And having the two callers look the same is a simplification too. > If there's nothing speaking against that, I'd prefer to elevate > select_service_node(...) to be a method as it needs quite a lot of state > anyway, especially as we will need global information about other > services than just the current one in the future anyway. I don't have strong feelings about this, both approaches seem fine to me. > So, I'd do something like > > sub select_service_node { > my ($self, $sid, $service_conf, $sd, $mode) = @_; > > my ($rules, $groups, $online_node_usage) = $self->@{qw(rules groups > online_node_usage)}; If we don't want to make it a method, we could still pass these ones separately. After implementing location rules, $groups would be dropped anyways. > my ($current_node, $tried_nodes, $maintenance_fallback) = $self- >>@{qw(node failed_nodes maintenance_node)}; It's $sd->... here. > here. It's not fancy as in there's a well-defined interface one can > immediately see what this helper needs (as it has access to the whole > $self) and doesn't have the guarantees of a standalone helper (won't > touch $self), but I think it could be better than creating helper > structures which are only pass a message, which is immediately > destructured anyway. We could also just pass $self slightly differently, > but I don't see much difference there. > > The $mode could then be a enumeration of e.g. whether $try_next (e.g. > 'try_again') or $best_scored (e.g. 'rebalance') is used (and can be Having a mode sounds good to me. I don't think it should be called 'rebalance', the best-scored semantics should apply to recovery too, see above. > extended of course). Those are mutually exclusive in the three call > sites right now. If next_state_recovery(...) really does have states > where $tried_nodes is set (and $maintenance_node too), then we can also > introduce a 'recovery' state, which will ignore them. > > The names for $service_conf and $sd can also be improved, but I wanted > to introduce minimal change to select_service_node(...) as well as stay > to the $sd name for the service data as in other places of the Manager.pm. > > That's still just a work in progress and I'd very appreciate some > feedback if any of the two above are viable options here. If it helps > any, I'd send the result as a separate series in advance which the HA > colocation will then be based on, so we don't loose focus in the HA > colocation patch series. > > CC'd @Fiona and @Fabian here, if you have any thoughts here :). _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel