Paul, I mixed up the two issues then. The bug I mentioned matches only what Wido reported in this thread. The CPU flags he ran into are the same ones raised in that bug: flags such as *npt* and *nrip-save*, which are present when SVM is enabled and are therefore affected by kernel commit 52297436199d ("kvm: svm: Update svm_xsaves_supported"). The OS and QEMU versions also match what is reported in "bug #1887490" against Ubuntu's qemu package.
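To make the flag mismatch concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how libvirt itself validates a CPU definition: the function names (cpu_features, unknown_on_target), the sample domain XML, and the TARGET_KNOWN set are all made up for this example. The idea, based on Wido's "unknown CPU feature: npt" error, is that every feature name mentioned in the instance's domain XML, whatever its policy, has to be a name the target host's CPU map defines.

```python
import xml.etree.ElementTree as ET


def cpu_features(domain_xml):
    """Return {feature name: policy} for each <feature> under <cpu> in a domain XML."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        return {}
    return {f.get("name"): f.get("policy", "") for f in cpu.findall("feature")}


def unknown_on_target(domain_xml, target_known):
    """List feature names used by the instance that the target host does not define.

    The policy is deliberately ignored: a name the target does not know is
    reported even when the instance only asks to disable it.
    """
    return sorted(name for name in cpu_features(domain_xml) if name not in target_known)


# Hypothetical sample data, loosely modelled on the XML quoted in this thread.
SOURCE_XML = """
<domain type='kvm'>
  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>EPYC</model>
    <feature policy='disable' name='npt'/>
    <feature policy='disable' name='nrip-save'/>
    <feature policy='require' name='svm'/>
  </cpu>
</domain>
"""

# Assumption: feature names defined on the older (target) host.
TARGET_KNOWN = {"svm", "ibpb", "aes"}

print(unknown_on_target(SOURCE_XML, TARGET_KNOWN))  # ['npt', 'nrip-save']
```

On a real host the set of known names would have to come from the files under /usr/share/libvirt/cpu_map/ that Wei mentioned; how to parse those is left out here on purpose.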
Regards

On Tue, Dec 7, 2021 at 12:10 PM Paul Angus <p...@angus.uk.com.invalid> wrote:

> The qemu-ev 2.10 bug was first reported a year or two ago in the mailing
> lists.
>
> -----Original Message-----
> From: Gabriel Bräscher <gabrasc...@gmail.com>
> Sent: Tuesday, December 7, 2021 9:41 AM
> To: dev <dev@cloudstack.apache.org>
> Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
>
> Just adding to the "qemu-ev 2.10" & "qemu-ev 2.12" point.
>
> > migration fails from qemu-ev 2.10 to qemu-ev 2.12, this is definitely
> > a bug in my point of view.
>
> On the comment 53 (at "bug #1887490"):
>
> > It seems *one of the patches also introduced a regression*:
> > * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch
> > adds various SVM-related flags. Specifically *npt and nrip-save are
> > now expected to be present by default* as shown in the updated testdata.
> > This however breaks migration from instances using EPYC or EPYC-IBPB
> > CPU models started with libvirt versions prior to this one because the
> > instance on the target host has these extra flags
>
> From the tests reported there, it fails in both ways.
> 1. From *older* qemu package to *newer*:
>    *source* host does not map the CPU flag; however, *target* host
>    expects the flag to be there, by default.
> 2. From *newer* qemu package to *older*:
>    the instance "domain.xml" in the *source* host has a CPU flag that is
>    not mapped by qemu in the *target* host.
>
> On Tue, Dec 7, 2021 at 10:22 AM Sven Vogel <s.vo...@ewerk.com> wrote:
>
> > Let me check. We had the same problem on RHEL/CentOS but I am not sure
> > if this is a bug. What I know is that there was a change in the XML.
> > Let me ask one of my colleagues in my team.
> >
> > 😉
> >
> > Sven Vogel
> >
> > From: Gabriel Bräscher <gabrasc...@gmail.com>
> > Date: Tuesday, December 7, 2021 at 09:57
> > To: dev <dev@cloudstack.apache.org>
> > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> >
> > Wei, I agree.
> > This is not necessarily a bug per se.
> >
> > The main point here is: the issue we are seeing is the "bug #1887490"
> > raised in Ubuntu's qemu package.
> > CPU features were added on the newer releases, which caused the
> > compatibility issue when (live) migrating VMs between compatible
> > hardware but different qemu packages.
> >
> > On Tue, Dec 7, 2021 at 9:26 AM Wei ZHOU <ustcweiz...@gmail.com> wrote:
> >
> > > Hi Gabriel,
> > >
> > > In my opinion, migration should work from lower version to higher
> > > version, but no guarantee from higher version to lower version, like we
> > > upgrade cloudstack.
> > > Therefore, migration should work from ubuntu 18.04 to ubuntu 20.04.
> > > But it is not a bug if migration fails from ubuntu 20.04 to ubuntu 18.04.
> > >
> > > As Paul said, migration fails from qemu-ev 2.10 to qemu-ev 2.12,
> > > this is definitely a bug in my point of view.
> > >
> > > -Wei
> > >
> > > On Mon, 6 Dec 2021 at 16:05, Gabriel Bräscher <gabrasc...@gmail.com>
> > > wrote:
> > >
> > > > Hi Paul (& all),
> > > >
> > > > I strongly believe that this is a bug in QEMU.
> > > > I was looking for bugs and found something that looks related to what
> > > > we are seeing. Precisely at Ubuntu's bug #*1887490*
> > > > <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490>:
> > > > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490
> > > >
> > > > In the link above, there was the following comment:
> > > > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490/comments/53
> > > >
> > > > It seems one of the patches also introduced a regression:
> > > > * lp-1887490-cpu_map-Add-missing-AMD-SVM-features.patch adds various
> > > > SVM-related flags.
> > > > Specifically npt and nrip-save are now expected to be present by
> > > > default as shown in the updated testdata. This however breaks
> > > > migration from instances using *EPYC* or *EPYC-IBPB* CPU models
> > > > started with libvirt versions prior to this one because the
> > > > instance on the target host has these extra flags
> > > >
> > > > More about #*1887490*
> > > > <https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1887490> can be
> > > > found at the mail
> > > > https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5842376.html.
> > > > We can see that the specific bug was addressed in "linux
> > > > (5.4.0-49.53) focal".
> > > >
> > > > linux (5.4.0-49.53) focal; urgency=medium
> > > >
> > > >   * Add/Backport EPYC-v3 and EPYC-Rome CPU model (LP: #1887490)
> > > >     - kvm: svm: Update svm_xsaves_supported
> > > >
> > > > Regards,
> > > > Gabriel.
> > > >
> > > > On Fri, Dec 3, 2021 at 10:59 AM Paul Angus <paul.an...@ticketmaster.com>
> > > > wrote:
> > > >
> > > > > Which version(s) of QEMU are you using Wido?
> > > > >
> > > > > We've just been upgrading CentOS 7.6 to 7.9. Most 7.6 hosts had
> > > > > qemu-ev 2.10 on them (the buggy one). 2.12 was on the new hosts.
> > > > > We were getting errors complaining that the ibpb CPU feature
> > > > > wasn't available when migrating to the updated OS hosts (even
> > > > > though identical hardware).
> > > > >
> > > > > Upgrading qemu-ev to 2.12 on the originating host, then stopping
> > > > > and starting the VMs, then allowed us to migrate. We couldn't
> > > > > find any solution that didn't involve stopping and starting the VMs.
> > > > >
> > > > > Paul.
> > > > >
> > > > > -----Original Message-----
> > > > > From: Wido den Hollander <w...@widodh.nl>
> > > > > Sent: Monday, November 29, 2021 7:57 AM
> > > > > To: dev@cloudstack.apache.org; Wei ZHOU <ustcweiz...@gmail.com>
> > > > > Subject: Re: Live migration between AMD Epyc and Ubuntu 18.04 and 20.04
> > > > >
> > > > > On 11/24/21 10:36 PM, Wei ZHOU wrote:
> > > > > > Hi Wido,
> > > > > >
> > > > > > I think it is not good to run an environment with two ubuntu/qemu
> > > > > > versions.
> > > > > > It always happens that some cpu features are supported in the higher
> > > > > > version but not supported in the older version.
> > > > > > From my experience, the migration from older version to higher
> > > > > > version works like a charm, but there were many issues in migration
> > > > > > from higher version to older version.
> > > > > >
> > > > >
> > > > > I understand. But with a large amount of hosts and working your
> > > > > way through upgrades you sometimes run into these situations.
> > > > > Therefore it would be welcome if it works.
> > > > >
> > > > > > I do not have a solution for you. I have tried to hack
> > > > > > /etc/libvirt/hooks/qemu but it didn't work.
> > > > > > Have you tried with other cpu models like x86_Opteron_G5? You
> > > > > > can find the cpu features of each cpu model in
> > > > > > /usr/share/libvirt/cpu_map/
> > > > > >
> > > > >
> > > > > I have not tried that yet, but I can see if that works.
> > > > >
> > > > > The EPYC-IBPB CPU model is identical on 18.04 and 20.04, but even
> > > > > using that model we can't seem to migrate as it complains about the
> > > > > 'npt' feature.
> > > > >
> > > > > Wido
> > > > >
> > > > > > Anyway, even if the vm migration succeeds, you do not know if vm
> > > > > > works fine. I believe the best solution is upgrading all hosts to
> > > > > > the same OS version.
> > > > > >
> > > > > > -Wei
> > > > > >
> > > > > > On Tue, 23 Nov 2021 at 16:31, Wido den Hollander <w...@widodh.nl>
> > > > > > wrote:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> I'm trying to debug an issue with live migrations between Ubuntu
> > > > > >> 18.04 and 20.04 machines each with different CPUs:
> > > > > >>
> > > > > >> - Ubuntu 18.04 with AMD Epyc 7552 (Rome)
> > > > > >> - Ubuntu 20.04 with AMD Epyc 7662 (Milan)
> > > > > >>
> > > > > >> We are currently using this setting:
> > > > > >>
> > > > > >> guest.cpu.mode=custom
> > > > > >> guest.cpu.model=EPYC
> > > > > >>
> > > > > >> This does not allow for live migrations:
> > > > > >>
> > > > > >> Ubuntu 20.04 with Epyc 7662 to Ubuntu 18.04 with Epyc 7552 fails
> > > > > >>
> > > > > >> "ExecutionException : org.libvirt.LibvirtException: unsupported
> > > > > >> configuration: unknown CPU feature: npt"
> > > > > >>
> > > > > >> So we tried to define a set of features manually:
> > > > > >>
> > > > > >> guest.cpu.features=3dnowprefetch abm adx aes apic arat avx avx2 bmi1
> > > > > >> bmi2 clflush clflushopt cmov cr8legacy cx16 cx8 de f16c fma fpu
> > > > > >> fsgsbase fxsr fxsr_opt lahf_lm lm mca mce misalignsse mmx mmxext
> > > > > >> monitor movbe msr mtrr nx osvw pae pat pclmuldq pdpe1gb pge pni
> > > > > >> popcnt pse pse36 rdrand rdseed rdtscp sep sha-ni smap smep sse sse2
> > > > > >> sse4.1 sse4.2 sse4a
> > > > > >> ssse3 svm syscall tsc vme xgetbv1 xsave xsavec xsaveopt -npt -x2apic
> > > > > >> -hypervisor -topoext -nrip-save
> > > > > >>
> > > > > >> This results in this going into the XML:
> > > > > >>
> > > > > >> <feature policy='disable' name='npt'/>
> > > > > >>
> > > > > >> You would say that works, but then the target host (18.04 with the
> > > > > >> 7552) says it doesn't support the feature 'npt' and the migration
> > > > > >> still fails.
> > > > > >>
> > > > > >> Now we could of course use the kvm64 CPU from Qemu, but that's
> > > > > >> lacking so many features that for example TLS offloading isn't
> > > > > >> available.
> > > > > >>
> > > > > >> I also tried to set 'EPYC-Rome' on the Ubuntu 20.04 hypervisor, but
> > > > > >> it then complains on the Ubuntu 18.04 hypervisor that the CPU
> > > > > >> 'EPYC-Rome' is unknown as the 18.04 hypervisor doesn't have that
> > > > > >> profile.
> > > > > >>
> > > > > >> Any ideas on how to get this working?
> > > > > >>
> > > > > >> Wido