On Mon, Jun 02, 2025 at 14:30:43 +0200, Hector Cao wrote:
> Hello Jiri,
> 
> Thanks for the feedback,
> 
> On Mon, Jun 2, 2025 at 9:30 AM Jiri Denemark <jdene...@redhat.com> wrote:
> 
> > On Mon, Jun 02, 2025 at 01:19:29 +0200, Hector Cao wrote:
> > > Several Intel CPU models with TSX technology (HLE & RTM features) are
> > > affected by the vulnerability TAA[1]. One of the mitigation methods
> > > for TAA is to disable TSX support on the host system. For that purpose,
> > > in 2021, Intel published a microcode update to disable TSX. Linux kernel
> > > also disables TSX globally by default. Even though TSX can be activated via
> > > the kernel command line (tsx=on), many Linux distributions stick with
> > > this default behavior and have TSX disabled. This makes existing CPU
> > > models that have HLE and RTM enabled not correctly detected by
> > > libvirt.
> >
> > Can you describe the issue in more detail? Especially where libvirt
> > incorrectly detects CPU models because of this?
> >
> >
> On my platform (Granite Rapids CPU), TSX is disabled by default in the
> kernel, so the TSX features rtm and hle are missing. As a consequence,
> `virsh capabilities` detects the CPU as the Icelake-Server-noTSX model.

I see, I suspected this was the case. The CPU definition provided in
host capabilities is limited: it cannot describe a CPU that lacks some
features of the corresponding CPU model, so a simpler CPU model has to
be shown instead. Thus this information is mostly useless (except for
checking what exact features a host CPU supports) and libvirt itself
does not use it for anything. And since we have a much better way of
describing the host CPU, or rather the CPU that can be provided to a
guest on the host
(virsh domcapabilities --xpath "//cpu/mode[@name='host-model']"),
there's no reason other applications or users should look at the CPU in
virsh capabilities either. It's similar to how the cpu/topology element
in virsh capabilities is useless and should not be used.

So except for not having the right CPU model in the capabilities XML
(which is not a bug, but rather a known limitation), is there any other
issue? I believe the host CPU would be correctly reported as
SapphireRapids/GraniteRapids with both hle and rtm disabled in domain
capabilities XML.
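
As a rough illustration of reading that information programmatically,
here is a minimal Python sketch. The XML below is a hand-written,
heavily shortened excerpt that only mimics the shape of the host-model
section in domain capabilities (real output lists many more features),
and the helper name host_model_summary is made up for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative excerpt only (an assumption, not real output from any host).
SAMPLE_XML = """
<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>GraniteRapids</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vmx'/>
      <feature policy='disable' name='hle'/>
      <feature policy='disable' name='rtm'/>
    </mode>
  </cpu>
</domainCapabilities>
"""


def host_model_summary(xml_text):
    """Return (model name, list of disabled features) for the host-model mode."""
    root = ET.fromstring(xml_text)
    mode = root.find("./cpu/mode[@name='host-model']")
    if mode is None or mode.get("supported") != "yes":
        return None, []
    model = mode.findtext("model")
    # Features the guest CPU must NOT have relative to the named model.
    disabled = [
        feature.get("name")
        for feature in mode.findall("feature")
        if feature.get("policy") == "disable"
    ]
    return model, disabled


if __name__ == "__main__":
    print(host_model_summary(SAMPLE_XML))
```

In practice the XML would come from `virsh domcapabilities` rather than
from an embedded string.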

> > > This commit adds 2 remaining -noTSX models:
> > > - SapphireRapids-noTSX
> > > - GraniteRapids-noTSX
> >
> > QEMU switched away from adding suffixes to CPU models and just adds a
> > new version for a CPU model in case it needs to be updated. There's no
> > point adding these models to libvirt. Any CPU model that would only
> > exist in libvirt would not be directly usable anyway and would have to
> > be translated to another CPU model.
> >
> 
> I would be grateful if you could provide some background on what the
> criteria are for adding a new version to an existing model. For Intel,
> how do we know that we need to add a new version to the CPU model?

I don't know, you'd need to ask QEMU developers.

> Beyond the naming issue (version vs suffix), I understand that we stopped
> doing what we did for older CPU models, as in this commit for Icelake.
> Do I understand correctly?
> 
> i386: Add -noTSX aliases for hle=off, rtm=off CPU models
> https://github.com/qemu/qemu/commit/02fa60d10137ed2ef17534718d7467e0d2170142

This was the original approach for creating modified CPU models that can
be used as-is without having to manually specify a bunch of features. But
when more cases appeared, they realized such an approach didn't scale and
switched to versioned CPU models with -v* suffixes instead.
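
To make "manually specify a bunch of features" concrete, here is a small
Python sketch that builds the value for QEMU's -cpu option from a base
model plus features to turn off. The helper name qemu_cpu_arg is
hypothetical; only the "model,feature=off" syntax is taken from QEMU.

```python
def qemu_cpu_arg(model, disabled_features):
    """Build a '-cpu' value such as 'Model,feat1=off,feat2=off'.

    Hypothetical helper for illustration; it only joins strings and does
    not check that the model or the features exist.
    """
    parts = [model] + [f"{name}=off" for name in disabled_features]
    return ",".join(parts)


if __name__ == "__main__":
    # Spelling out by hand what a -noTSX alias bundles into a single name.
    print(qemu_cpu_arg("Icelake-Server", ["hle", "rtm"]))
```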

> Do you think that adding a new version for the Sapphire Rapids and
> Granite Rapids CPU models, both in QEMU and libvirt, would be a
> sensible way to tackle this issue?

Well, you can try asking whether adding such CPU model in QEMU would
make sense. From libvirt's POV this is just a cosmetic issue so not
worth the effort IMHO.

Jirka
