>>I wonder whether reusing (/extending) the existing SSH tunnel for the
>>commands we run on the target node might reduce the overhead as well?
>>for cleanup in error cases opening a new connection is probably still
>>advisable.

Yes, maybe. I don't know whether the time goes into forking the qm process, establishing the SSH tunnel, or waiting for the response. I'll try to add timers for this.

Another idea: why not use an HTTPS API call through pveproxy directly?

I have verified with the QMP status:

Without the pvesr call, around 20ms:

2017-07-28 10:24:45,184 -- VM status: paused (inmigrate)
2017-07-28 10:24:45,208 -- VM status: running

With the pvesr call, around 4s:

2017-07-28 10:38:28,711 -- VM status: paused (inmigrate)
2017-07-28 10:38:28,745 -- VM status: paused
2017-07-28 10:38:28,799 -- VM status: paused
2017-07-28 10:38:28,818 -- VM status: paused
2017-07-28 10:38:28,837 -- VM status: paused
....
2017-07-28 10:38:33,912 -- VM status: running

----- Original message -----
From: "Fabian Grünbichler" <f.gruenbich...@proxmox.com>
To: "pve-devel" <pve-devel@pve.proxmox.com>
Sent: Friday, 28 July 2017 10:46:55
Subject: Re: [pve-devel] Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)

On Fri, Jul 28, 2017 at 10:09:55AM +0200, Alexandre DERUMIER wrote:
>
> I have added some timers and done a migration without storage replication.
>
> -> main migration loop: 150ms increase (it's lower if I put a usleep of 1ms)
>
> 2017-07-28 10:00:10 transfer_replication_state: 1.436832
> 2017-07-28 10:00:10 move config: 0.001174
> 2017-07-28 10:00:10 switch_replication_job_target: 0.003125
> 2017-07-28 10:00:12 qm resume: 1.634583 -> (this is the time measured on the
> source until the response arrives; not sure how much time it takes exactly
> on the remote)

I guess only marginally less on the target until the VM is actually resumed.

>
> It seems to be transfer_replication_state, which calls
> my $cmd = [ @{$self->{rem_ssh}}, 'pvesr', 'set-state', $self->{vmid},
> $state];
>
>
> I think calling a remote qm command takes some time to get a response.
> Note that I don't use pvesr, so I think we should bypass these commands if
> not needed.
>

yes, checking whether a state / job exists earlier on, and only transferring
state and switching the direction conditionally if needed would be an
improvement for sure.

I wonder whether reusing (/extending) the existing SSH tunnel for the
commands we run on the target node might reduce the overhead as well?
for cleanup in error cases opening a new connection is probably still
advisable.

those two improvements might get us into the <1s range again, without
sacrificing consistency on the way.

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
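
For reference, the downtime in the two "VM status" traces quoted in this thread can be read off by subtracting the timestamp of the first "paused (inmigrate)" line from that of the first "running" line. Below is a minimal sketch of that arithmetic in Python. It is not part of the Proxmox code; the helper names (parse_stamp, downtime_ms) and the two sample lists are hypothetical and only reuse the log lines shown in the thread (the lines elided by "...." are left out).

```python
from datetime import datetime, timedelta

# Timestamp layout of the quoted status lines, e.g. "2017-07-28 10:24:45,184".
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_stamp(line):
    """Return the datetime at the start of a 'STAMP -- VM status: ...' line."""
    stamp, _, _ = line.partition(" -- ")
    return datetime.strptime(stamp.strip(), STAMP_FORMAT)


def downtime_ms(lines):
    """Whole milliseconds between the first 'inmigrate' and first 'running' line."""
    start = next(parse_stamp(l) for l in lines if "(inmigrate)" in l)
    end = next(parse_stamp(l) for l in lines if l.rstrip().endswith("running"))
    return (end - start) // timedelta(milliseconds=1)


# Sample data copied from the thread (hypothetical variable names).
WITHOUT_PVESR = [
    "2017-07-28 10:24:45,184 -- VM status: paused (inmigrate)",
    "2017-07-28 10:24:45,208 -- VM status: running",
]
WITH_PVESR = [
    "2017-07-28 10:38:28,711 -- VM status: paused (inmigrate)",
    "2017-07-28 10:38:28,745 -- VM status: paused",
    "2017-07-28 10:38:28,837 -- VM status: paused",
    "2017-07-28 10:38:33,912 -- VM status: running",
]

print(downtime_ms(WITHOUT_PVESR))  # 24
print(downtime_ms(WITH_PVESR))     # 5201
```

On these two traces the sketch gives about 24 ms without the pvesr call and about 5.2 s with it, measured purely from the logged timestamps.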