On May 24, 2022 1:41 pm, Dominik Csapak wrote:
> when running replication, we don't want to keep replication states for
> non-local vms. Normally this would not be a problem, since on migration,
> we transfer the states anyway, but when the ha-manager steals a vm, it
> cannot do that. In that case, having an old state lying around is
> harmful, since the code does not expect the state to be out-of-sync
> with the actual snapshots on disk.
> 
> One such problem is the following:
> 
> Replicate vm 100 from node A to node B and C, and activate HA. When node
> A dies, it will be relocated to e.g. node B and start replicate from
> there. If node B now had an old state lying around for it's sync to node
> C, it might delete the common base snapshots of B and C and cannot sync
> again.
> 
> Deleting the state for all non local guests fixes that issue, since it
> always starts fresh, and the potentially existing old state cannot be
> valid anyway since we just relocated the vm here (from a dead node).
> 
> Signed-off-by: Dominik Csapak <d.csa...@proxmox.com>

the logic seems sound, the state *is* invalid/outdated once the guest 
has been stolen..

Reviewed-by: Fabian Grünbichler <f.gruenbich...@proxmox.com>

> ---
> i tested it in various configurations and with live migration, offline
> migration, ha relocation etc. and did not find an issue, but since
> the check was already there and commented out, maybe someone has a
> better idea why this might not be a good thing, so sending as RFC
> 
>  src/PVE/ReplicationState.pm | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/PVE/ReplicationState.pm b/src/PVE/ReplicationState.pm
> index 0a5e410..8eebb42 100644
> --- a/src/PVE/ReplicationState.pm
> +++ b/src/PVE/ReplicationState.pm
> @@ -215,7 +215,7 @@ sub purge_old_states {
>       my $tid = $plugin->get_unique_target_id($jobcfg);
>       my $vmid = $jobcfg->{guest};
>       $used_tids->{$vmid}->{$tid} = 1
> -         if defined($vms->{ids}->{$vmid}); # && $vms->{ids}->{$vmid}->{node} 
> eq $local_node;
> +         if defined($vms->{ids}->{$vmid}) && $vms->{ids}->{$vmid}->{node} eq 
> $local_node;
>      }
>  
>      my $purge_state = sub {
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

Reply via email to