True, true... I forgot these cases from when I was running KVM. Check whether that VM is using a compute offering marked as "HA enabled" - and if YES, then Wei is 100% right (you can confirm this from the logs by checking for info on starting that VM on a specific hypervisor, etc.). Though, if doing live migration, I assume it should play fair/nice with HA and HA should not kick in.
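For reference, a quick way to check the offering is with CloudMonkey - just a rough sketch, not tested against 4.7, and <vm-uuid>/<offering-uuid> are placeholders for your own IDs:

    list virtualmachines id=<vm-uuid> filter=name,haenable,serviceofferingid
    list serviceofferings id=<offering-uuid> filter=name,offerha

If "offerha" (or the VM's own "haenable") comes back true, HA was eligible to restart that VM.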
Wei, can you confirm if these 2 play together nicely ^^^ ?

Cheers

On Wed, 30 Oct 2019 at 13:11, Wei ZHOU <ustcweiz...@gmail.com> wrote:

> Hi Rakesh,
>
> The duplicated VM is not caused by migration, but by HA.
>
> -Wei
>
> On Wed, 30 Oct 2019 at 11:31, Rakesh Venkatesh <www.rakeshv....@gmail.com>
> wrote:
>
> > Hi Andrija
> >
> > Sorry for the late reply.
> >
> > I'm using version 4.7 of ACS and QEMU version 1:2.5+dfsg-5ubuntu10.40.
> >
> > I'm not sure whether the ACS job or the libvirt job failed, as I didn't
> > look into the logs. Yes, the VM is in the paused state during migration,
> > but after the failed migration the same VM was in the "running" state on
> > two different hypervisors. We wrote a script to find out how many
> > duplicated VMs were running and found that more than 5 VMs had this
> > issue.
> >
> > On Mon, Oct 28, 2019 at 2:42 PM Andrija Panic <andrija.pa...@gmail.com>
> > wrote:
> >
> > > I've been running a KVM public cloud until recently and have never
> > > seen such behaviour.
> > >
> > > What versions (ACS, qemu, libvirt) are you running?
> > >
> > > How does the migration fail - the ACS job or the libvirt job?
> > > The destination VM is by default always in the PAUSED state until the
> > > migration is finished - only then does the destination VM (on the new
> > > host) become RUNNING, after the original VM (on the old host) has been
> > > paused.
> > >
> > > i.e.
> > > phase 1: source VM RUNNING, destination VM PAUSED (RAM content being
> > > copied over... takes time...)
> > > phase 2: source VM PAUSED, destination VM PAUSED (last bits of RAM
> > > content are migrated)
> > > phase 3: source VM destroyed, destination VM RUNNING.
> > >
> > > Andrija
> > >
> > > On Mon, 28 Oct 2019 at 14:26, Rakesh Venkatesh <
> > > www.rakeshv....@gmail.com> wrote:
> > >
> > > > Hello Users
> > > >
> > > > Recently we have seen cases where, when VM migration fails,
> > > > CloudStack ends up running two instances of the same VM on different
> > > > hypervisors. The state will be "running" and not some other
> > > > transition state. This will of course lead to disk corruption. Does
> > > > CloudStack have any option for volume locking so that two instances
> > > > of the same VM won't be running?
> > > > Has anyone else faced this issue and found a solution for it?
> > > >
> > > > We are thinking of using libvirt's "virtlockd" or implementing a
> > > > custom locking mechanism. There are pros and cons to both solutions,
> > > > and I want your feedback before proceeding further.
> > > >
> > > > --
> > > > Thanks and regards
> > > > Rakesh Venkatesh
> > >
> > >
> > > --
> > >
> > > Andrija Panić
> >
> >
> > --
> > Thanks and regards
> > Rakesh Venkatesh


--
Andrija Panić
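PS: for the virtlockd option mentioned in the thread, a minimal sketch of what enabling it usually looks like on each KVM host - paths and values are the common defaults, so please verify against your libvirt version, and the lockspace directory must live on storage shared by all hosts for the locks to be visible cluster-wide:

    # /etc/libvirt/qemu.conf - tell the QEMU driver to use the lockd plugin
    lock_manager = "lockd"

    # /etc/libvirt/qemu-lockd.conf - indirect leases kept in a shared directory
    auto_disk_leases = 1
    file_lockspace_dir = "/var/lib/libvirt/lockd/files"
    require_lease_for_disks = 1

    # then start the lock daemon and restart libvirt
    systemctl enable --now virtlockd
    systemctl restart libvirtd

With that in place, a second host trying to start a domain whose disks are already leased should get a libvirt error instead of booting a duplicate.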