Let me check. We had the same problem on RHEL/CentOS, but I am not sure whether this is a bug. What I do know is that there was a change in the XML. Let me ask one of the colleagues on my team.
😉

Sven Vogel
Senior Manager Research and Development - Cloud and Infrastructure
EWERK DIGITAL GmbH

From: Gabriel Bräscher <gabrasc...@gmail.com>
Date: Tuesday, 7 December 2021 at 09:57
To: dev <dev@cloudstack.apache.org>
Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04

Wei, I agree. This is not necessarily a bug per se.
The main point here is that the issue we are seeing is "bug #1887490", raised against Ubuntu's qemu package.
CPU features were added on the newer releases, which caused the compatibility issue when (live) migrating VMs between compatible hardware but different qemu packages.

On Tue, Dec 7, 2021 at 9:26 AM Wei ZHOU <ustcweiz...@gmail.com> wrote:

> Hi Gabriel,
>
> In my opinion, migration should work from lower version to higher version,
> but no guarantee from higher version to lower version, like we upgrade
> cloudstack.
> Therefore, migrate should work from ubuntu 18.04 to ubuntu 20.04. But it is
> not a bug if migration fails from ubuntu 20.04 to ubuntu 18.04.
>
> As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is
> definitely a bug in my point of view.
>
> -Wei
>
> On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher <gabrasc...@gmail.com>
> wrote:
>
> > Hi Paul (& all),
> >
> > I strongly believe that this is a bug in QEMU.
> > I was looking for bugs and found something that looks related to what we
> > are seeing. Precisely at Ubuntu's bug #1887490:
> > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490
> >
> > In the link above, there was the following comment:
> > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53
> >
> > It seems one of the patches also introduced a regression:
> > lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
> > SVM-related flags. Specifically npt and nrip-save are now expected to be
> > present by default as shown in the updated testdata. This however breaks
> > migration from instances using EPYC or EPYC-IBPB CPU models started
> > with libvirt versions prior to this one because the instance on the target
> > host has these extra flags
> >
> > More about #1887490 can be found at the mail
> > https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
> > We can see that the specific bug was addressed in "linux (5.4.0-49.53)
> > focal".
> >
> > linux (5.4.0-49.53) focal; urgency=medium
> >
> >   * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
> >     - kvm: svm: Update svm_xsaves_supported
> >
> > Regards,
> > Gabriel.
> >
> > On Fri, Dec 3, 2021 at 10:59 AM Paul Angus <paul.an...@ticketmaster.com>
> > wrote:
> >
> > > Which version(s) of QEMU are you using, Wido?
> > >
> > > We've just been upgrading CentOS 7.6 to 7.9.
> > > Most 7.6 hosts had qemu-ev 2.10 on them (the buggy one). 2.12 was on the
> > > new hosts.
> > > We were getting errors complaining that the ibpb CPU feature wasn't
> > > available when migrating to the updated OS hosts (even though identical
> > > hardware).
> > >
> > > Upgrading qemu-ev to 2.12 on the originating host, then stopping and
> > > starting the VMs, then allowed us to migrate. We couldn't find any
> > > solution that didn't involve stopping and starting the VMs.
> > >
> > > Paul.
> > >
> > > -----Original Message-----
> > > From: Wido den Hollander <w...@widodh.nl>
> > > Sent: Monday, November 29, 2021 7:57 AM
> > > To: dev@cloudstack.apache.org; Wei ZHOU <ustcweiz...@gmail.com>
> > > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> > >
> > > On 11/24/21 10:36 PM, Wei ZHOU wrote:
> > > > Hi Wido,
> > > >
> > > > I think it is not good to run an environment with two ubuntu/qemu
> > > > versions.
> > > > It always happens that some cpu features are supported in the higher
> > > > version but not supported in the older version.
> > > > From my experience, the migration from older version to higher version
> > > > works like a charm, but there were many issues in migration from
> > > > higher version to older version.
> > > >
> > >
> > > I understand. But with a large number of hosts, and working your way
> > > through upgrades, you sometimes run into these situations. Therefore it
> > > would be welcome if it works.
> > >
> > > > I do not have a solution for you. I have tried to hack
> > > > /etc/libvirt/hooks/qemu but it didn't work.
> > > > Have you tried with other cpu models like x86_Opteron_G5? You can
> > > > find the cpu features of each cpu model in /usr/share/libvirt/cpu_map/
> > > >
> > >
> > > I have not tried that yet, but I can see if that works.
> > >
> > > The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even using
> > > that model we can't seem to migrate as it complains about the 'npt'
> > > feature.
> > >
> > > Wido
> > >
> > > > Anyway, even if the vm migration succeeds, you do not know if vm works
> > > > fine. I believe the best solution is upgrading all hosts to the same
> > > > OS version.
> > > >
> > > > -Wei
> > > >
> > > > On Tue, 23 Nov 2021 at 16:31, Wido den Hollander <w...@widodh.nl> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I'm trying to debug an issue with live migrations between Ubuntu
> > > >> 18.04 and 20.04 machines each with different CPUs:
> > > >>
> > > >> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
> > > >> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
> > > >>
> > > >> We are currently using this setting:
> > > >>
> > > >> guest.cpu.mode=custom
> > > >> guest.cpu.model=EPYC
> > > >>
> > > >> This does not allow for live migrations:
> > > >>
> > > >> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
> > > >>
> > > >> "ExecutionException : org.libvirt.LibvirtException: unsupported
> > > >> configuration: unknown CPU feature: npt"
> > > >>
> > > >> So we tried to define a set of features manually:
> > > >>
> > > >> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
> > > >> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu
> > > >> fsgsbase fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext
> > > >> monitor movbe msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni
> > > >> popcnt pse pse36 rdrand rdseed rdtscp sep sha-ni smap smep sse sse2
> > > >> sse4.1 sse4.2 sse4a
> > > >> ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
> > > >> -hypervisor -topoext -nrip-save
> > > >>
> > > >> This results in this going into the XML:
> > > >>
> > > >> <feature policy='disable' name='npt'/>
> > > >>
> > > >> You would say that works, but then the target host (18.04 with the
> > > >> 7552) says it doesn't support the feature 'npt' and the migration
> > > >> still fails.
> > > >>
> > > >> Now we could of course use the kvm64 CPU from Qemu, but that's lacking
> > > >> so many features that for example TLS offloading isn't available.
> > > >>
> > > >> I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but
> > > >> it then complains on the Ubuntu 18.04 hypervisor that the CPU
> > > >> 'EPYC-Rome' is unknown as the 18.04 hypervisor doesn't have that
> > > >> profile.
> > > >>
> > > >> Any ideas on how to get this working?
> > > >>
> > > >> Wido
> > > >>
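
Following Wei's pointer to /usr/share/libvirt/cpu_map/, one way to see which features a CPU model gained between two hosts is to diff the model definitions from each hypervisor. The sketch below is illustrative only. The two XML snippets are made-up, trimmed-down stand-ins for a model definition (they are not copied from either Ubuntu release), and `model_features` and `feature_diff` are hypothetical helper names. It assumes features appear as `<feature name='...'/>` children of a `<model name='...'>` element and does not follow any model inheritance.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed-down model definitions for illustration only.
# Real definitions list many more features.
OLD_HOST_XML = """
<cpus>
  <model name='EPYC'>
    <vendor name='AMD'/>
    <feature name='aes'/>
    <feature name='svm'/>
  </model>
</cpus>
"""

NEW_HOST_XML = """
<cpus>
  <model name='EPYC'>
    <vendor name='AMD'/>
    <feature name='aes'/>
    <feature name='npt'/>
    <feature name='nrip-save'/>
    <feature name='svm'/>
  </model>
</cpus>
"""


def model_features(cpu_map_xml, model):
    """Return the set of feature names listed directly under one CPU model."""
    root = ET.fromstring(cpu_map_xml)
    for node in root.iter("model"):
        if node.get("name") == model:
            return {f.get("name") for f in node.findall("feature")}
    raise KeyError(f"CPU model {model!r} not found")


def feature_diff(old_xml, new_xml, model):
    """Return (features only in old, features only in new), both sorted."""
    old = model_features(old_xml, model)
    new = model_features(new_xml, model)
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    only_old, only_new = feature_diff(OLD_HOST_XML, NEW_HOST_XML, "EPYC")
    print("only on old host:", only_old)
    print("only on new host:", only_new)
```

On real hosts the two strings would be read from the cpu_map file(s) on each hypervisor. Features that show up only on the newer side are candidates for the "unknown CPU feature" error seen on the older one.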