On Mon, Jun 2, 2025 at 3:23 PM Jiří Denemark <jdene...@redhat.com> wrote:
> On Mon, Jun 02, 2025 at 14:30:43 +0200, Hector Cao wrote:
> > Hello Jiri,
> >
> > Thanks for the feedback,
> >
> > On Mon, Jun 2, 2025 at 9:30 AM Jiri Denemark <jdene...@redhat.com> wrote:
> >
> > > On Mon, Jun 02, 2025 at 01:19:29 +0200, Hector Cao wrote:
> > > > Several Intel CPU models with TSX technology (HLE & RTM features) are
> > > > affected by the vulnerability TAA[1]. One of the mitigation methods
> > > > for TAA is to disable TSX support on the host system. For that purpose,
> > > > in 2021, Intel published a microcode update to disable TSX. Linux kernel
> > > > also disables TSX globally by default. Even though TSX can be activated via
> > > > the kernel command line (tsx=on), many Linux distributions stick with
> > > > this default behavior and have TSX disabled. This makes existing CPU
> > > > models that have HLE and RTM enabled not correctly detected by
> > > > libvirt.
> > >
> > > Can you describe the issue in more details? Especially where libvirt
> > > incorrectly detects CPU models because of this?
> > >
> >
> > On my platform (Granite Rapids CPU) with TSX disabled by default in the
> > kernel
> > The TSX features rtm and hle are missing, per consequence, `virsh
> > capabilities` detects the CPU as
> > Icelake-Server-noTSX model.
>
> I see, I was thinking this was the case. The CPU definition provided in
> host capabilities is limited and cannot cover CPUs that lack some
> features compared to the corresponding CPU model and a simpler CPU model
> has to be shown instead. Thus this information is mostly useless (except
> for checking what exact features a host CPU supports) and it's not used
> for anything by libvirt itself. And since we have a much better way of
> describing the host CPU or rather a CPU that can be provided to a guest
> on the host (virsh domcapabilities --xpath
> "//cpu/mode[@name='host-model']")
> there's no reason other applications or users should look at the CPU in
> virsh capabilities either. It's similar to how cpu/topology element in
> virsh capabilities is useless and should not be used.
>
> So except for not having the right CPU model in the capabilities XML
> (which is not a bug, but rather a known limitation), is there any other
> issue? I believe the host CPU would be correctly reported as
> SapphireRapids/GraniteRapids with both hle and rtm disabled in domain
> capabilities XML.
>

Yes, you are right: if the rtm and hle features are available, Granite
Rapids will be correctly reported by virsh capabilities once the MSR bug
is fixed (please take a look at:
https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/XNOHU7PODTZVCX7ZQ2PBM7DRQRG2D6C7/
).

You are also right that this is not a bug but rather a known limitation.
However, we are getting regular bug reports from users who are not aware
of this known limitation and are confused. I would think that if we can
offer a better experience and save time for everyone, it might be worth
the effort, especially as GraniteRapids would be the last CPU model
affected by this issue.

If you still believe that this little effort is not useful, I would think
that we can tackle this issue by offering better documentation about this
known limitation. What do you think? We are thinking about documenting it
on Ubuntu, but do you think that we can do something more upstream?

Thanks!

> > > > This commit adds 2 remaining -noTSX models:
> > > > - SapphireRapids-noTSX
> > > > - GraniteRapids-noTSX
> > >
> > > QEMU switched away from adding suffixes to CPU models and just adds a
> > > new version for a CPU model in case it needs to be updated. There's no
> > > point adding these models to libvirt. Any CPU model that would only
> > > exist in libvirt would not be directly usable anyway and would have to
> > > be translated to another CPU model.
> > >
> >
> > I would be grateful if you can provide me some background on what is the
> > criteria to add a
> > new version to an existing model. For the case of Intel, how do we know
> > that we need to
> > add a new version to the CPU model ?
>
> I don't know, you'd need to ask QEMU developers.
>
> > Beyond the naming issue (version vs suffix), I understand that we stopped
> > doing what we did for older CPU models
> > like this commit for Icelake, do I understand it correctly ?
> >
> > i386: Add -noTSX aliases for hle=off, rtm=off CPU models
> > https://github.com/qemu/qemu/commit/02fa60d10137ed2ef17534718d7467e0d2170142
>
> This was the original approach for creating modified CPU models that can
> be used as-is without having to manually specify bunch of features. But
> when more cases appeared they realized such approach didn't scale and
> switched to versioned CPU models with -v* suffixes instead.
>
> > Do you think that adding a new version for Sapphire and Granite Rapids
> > CPU models both in QEMU and libvirt would be something that makes
> > sense to tackle this issue ?
>
> Well, you can try asking whether adding such CPU model in QEMU would
> make sense. From libvirt's POV this is just a cosmetic issue so not
> worth the effort IMHO.
>
> Jirka
>

--
Hector CAO
Software Engineer – Partner Engineering Team
hector....@canonical.com
https://launchpad.net/~hectorcao
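
For reference, the host-model lookup suggested in the quoted reply (reading
`//cpu/mode[@name='host-model']` from the domain capabilities XML rather than
the CPU in `virsh capabilities`) could be scripted roughly as below. This is
only a sketch: the XML in SAMPLE_DOMCAPS is a hand-written illustration of the
domain capabilities layout, not output captured from a real Granite Rapids
host, and the helper name host_model_cpu is made up for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real host output: it only illustrates the
# shape of the <cpu> section in the domain capabilities XML.
SAMPLE_DOMCAPS = """
<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>GraniteRapids</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vmx'/>
      <feature policy='disable' name='hle'/>
      <feature policy='disable' name='rtm'/>
    </mode>
  </cpu>
</domainCapabilities>
"""


def host_model_cpu(domcaps_xml):
    """Return (model, disabled_features) for the host-model CPU mode,
    or None if the mode is absent or not marked as supported."""
    root = ET.fromstring(domcaps_xml)
    # Same selection as the XPath quoted in the thread, relative to the root.
    mode = root.find("./cpu/mode[@name='host-model']")
    if mode is None or mode.get("supported") != "yes":
        return None
    model = mode.findtext("model")
    disabled = sorted(
        feature.get("name")
        for feature in mode.findall("feature")
        if feature.get("policy") == "disable"
    )
    return model, disabled


if __name__ == "__main__":
    print(host_model_cpu(SAMPLE_DOMCAPS))
```

On a real host, the string passed to the helper would be the output of
`virsh domcapabilities` instead of the sample above.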