thx Wei - that should (so I was told) kill the DESTINATION VM on failed
migrations - i.e. perform cleanup - so that is OK?

On Fri, 22 Nov 2019 at 10:59, Wei ZHOU <[email protected]> wrote:

> Hi Andrija,
>
> As I remember, it happened in our production a few years ago.
>
> https://github.com/apache/cloudstack/blob/master/engine/orchestration/src/main/java/com/cloud/vm/VirtualMachineManagerImpl.java#L2962-L2983
>
>
>  -Wei
>
> On Fri, 22 Nov 2019 at 09:34, Andrija Panic <[email protected]>
> wrote:
>
> > Thx both, thx Wei - that sounds all interesting.
> >
> > as for "vm migration fails and no retry in cloudstack" - this should NOT
> > trigger stopping the VM - at least what I saw so far - simply host will be
> > in ErrorMaintenance - can you confirm VMs are not stopped in this case?
> >
> > On Fri, 22 Nov 2019 at 08:54, Wei ZHOU <[email protected]> wrote:
> >
> > > Hi Andrija,
> > >
> > > We have faced some vm migration issues. There are actually three categories:
> > > 1. vm migration fails due to different hardware or software on source and
> > > destination hosts, for example, cpu models. vm will be still running on
> > > source host.
> > > you may find some errors in agent.log.
> > > 2. vm migration fails due to some libvirt/qemu bugs. you may find some
> > > errors in /var/log/libvirt/qemu/ folder (on ubuntu) on the source or
> > > destination host.
> > > mostly the vm will be still running on source host. In rare cases the vm is
> > > stopped.
> > > 3. vm is stopped due to some cloudstack bugs. for example, when we put a
> > > host to maintenance, the vm will be stopped if (1) no other host is Up in
> > > same cluster, or (2) vm migration fails and no retry in cloudstack, or (3)
> > > multiple vms are migrated to same destination at the same time but there is
> > > not enough memory on the destination.
> > >
> > > We need to fix the issues mentioned in part 3 above in cloudstack.
> > >
> > > In Leaseweb, to improve the vm migration
> > > (1) we use custom cpu model, see
> > > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/hypervisor/kvm.html#configure-cpu-model-for-kvm-guest-optional
> > > (2) we have built our own qemu packages with some bug fixes for
> > > installation
> > > (3) we have some fixes in our fork from 4.7.1. We have not tested with
> > > 4.13/4.14.
> > > We still see failed vm migration sometimes. However the vms will not be
> > > stopped if migration fails.
> > >
> > > -Wei
> > >
> > > On Fri, 22 Nov 2019 at 01:54, Andrija Panic <[email protected]>
> > > wrote:
> > >
> > > > ( @Sven, not being able to migrate Vm with ISO attached - don't recall
> > > > testing/doing that recently - but is technically perfectly possible, unless
> > > > we don't support it via CloudStack - feel free to open GitHub issue with
> > > > correct steps to reproduce etc)
> > > >
> > > > On Fri, 22 Nov 2019 at 01:47, Andrija Panic <[email protected]>
> > > > wrote:
> > > >
> > > > > That sucks...thx both.
> > > > >
> > > > > @both - which ACS version do you use (and encounter such issues?)
> > > > >
> > > > > Ubuntu comes with a whole other set of issues (I was losing my nerves
> > > > > around very idiotic things, last time a week ago...) - though most can be
> > > > > managed with some workarounds.
> > > > > But yes, Qemu/libvirt should be better with Ubuntu - free of RedHat
> > > > > s$^%tty business politics - e.g. in CentOS 6.x you were able to live
> > > > > migrate VM WITH all the volumes to another host/storage. On CentOS 7 you
> > > > > can't do that any more, unless you are using qemu-kvm-ev (but not the
> > > > > regular one from the SIG CentOS repo, you need the one from the oVirt
> > > > > project)
> > > > >
> > > > > I'm just trying to understand if this is happening also on e.g. ACS 4.11 -
> > > > > so to stop digging around the problem (and assume it's purely CentOS which
> > > > > is broken - why all great things need to come to an end...damn it)
> > > > >
> > > > > (well I could also test same ACS code on Ubuntu and see if no issues there
> > > > > with live migrations..)
> > > > >
> > > > > Thanks
> > > > > Andrija
> > > > >
> > > > > On Thu, 21 Nov 2019 at 23:39, Jean-Francois Nadeau <[email protected]>
> > > > > wrote:
> > > > >
> > > > >> Hi Andrija,
> > > > >>
> > > > >> We experienced that problem with stock packages on CentOS 7.4.  Live
> > > > >> migration would frequently fail and leave the VM dead.  We since moved to
> > > > >> RHEV packages for qemu.  Libvirt is still stock per CentOS 7.6 (4.5).  I
> > > > >> want to say the situation improved but I can't tell yet if we have a 100%
> > > > >> success rate on live migrations (as it should be !)
> > > > >>
> > > > >> Redhat also have been messing up severely with stock libvirt versions
> > > > >> between 7.4/7.5/7.6 in such way it broke live migration compatibility (cpu
> > > > >> definitions).  I'm at the crossroads right now to entirely ditch
> > > > >> centos/redhat in favor of Ubuntu to have well tested stock packages.
> > > > >>
> > > > >> best,
> > > > >>
> > > > >> -Jfn
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Thu, Nov 21, 2019 at 5:25 PM Andrija Panic <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi guys.
> > > > >> >
> > > > >> > I wanted to see if any of you have seen similar/same in master, as below.
> > > > >> >
> > > > >> > I've been testing some work/PRs (against the current master) and I've seen
> > > > >> > that VMs will crash/be stopped occasionally when live migration is
> > > > >> > happening. I experienced this on a NEW/EMPTY env, with 2 KVM hosts, and
> > > > >> > only SSVM and CPVM - not a capacity issue or similar.
> > > > >> >
> > > > >> > This is happening with CentOS 7 (CentOS 7.3 I believe, but we also updated
> > > > >> > packages to the latest stock ones and same issue was happening again).
> > > > >> >
> > > > >> > This is still under investigation, but I was wondering if anyone else has
> > > > >> > seen similar thing happening?
> > > > >> >
> > > > >> > Best,
> > > > >> >
> > > > >> > --
> > > > >> >
> > > > >> > Andrija Panić
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Andrija Panić
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Andrija Panić
> > > >
> > >
> >
> >
> > --
> >
> > Andrija Panić
> >
>


-- 

Andrija Panić
