RRSBA for guest policies

Andrew Cooper Thu, 15 Jun 2023 03:42:23 -0700

On 15/06/2023 9:30 am, Jan Beulich wrote:
> On 14.06.2023 20:12, Andrew Cooper wrote:
>> On 13/06/2023 10:59 am, Jan Beulich wrote:
>>> On 12.06.2023 18:13, Andrew Cooper wrote:
>>>> The RSBA bit, "RSB Alternative", means that the RSB may use alternative
>>>> predictors when empty.  From a practical point of view, this means
>>>> "Retpoline not safe".
>>>>
>>>> Enhanced IBRS (officially IBRS_ALL in Intel's docs, previously IBRS_ATT)
>>>> is a statement that IBRS is implemented in hardware (as opposed to the
>>>> form retrofitted to existing CPUs in microcode).
>>>>
>>>> The RRSBA bit, "Restricted-RSBA", is a combination of RSBA, and the eIBRS
>>>> property that predictions are tagged with the mode in which they were
>>>> learnt.  Therefore, it means "when eIBRS is active, the RSB may fall back
>>>> to alternative predictors but restricted to the current prediction mode".
>>>> As such, it's a stronger statement than RSBA, but still means "Retpoline
>>>> not safe".
>>>>
>>>> CPUs are not expected to enumerate both RSBA and RRSBA.
>>>>
>>>> Add feature dependencies for EIBRS and RRSBA.  While technically they're
>>>> not linked, absolutely nothing good can come of letting the guest see
>>>> RRSBA without EIBRS.  Nor a guest seeing EIBRS without IBRSB.  Furthermore,
>>>> we use this dependency to simplify the max derivation logic.
>>>>
>>>> The max policies get RSBA and RRSBA unconditionally set (with the EIBRS
>>>> dependency maybe hiding RRSBA).  We can run any VM, even if it has been
>>>> told "somewhere you might run, Retpoline isn't safe".
>>>>
>>>> The default policies are more complicated.  A guest shouldn't see both
>>>> bits, but it needs to see one if the current host suffers from any form of
>>>> RSBA, and which bit it needs to see depends on whether eIBRS is visible or
>>>> not.  Therefore, the calculation must be performed after
>>>> sanitise_featureset().
>>>>
>>>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>>>> ---
>>>> CC: Jan Beulich <jbeul...@suse.com>
>>>> CC: Roger Pau Monné <roger....@citrix.com>
>>>> CC: Wei Liu <w...@xen.org>
>>>>
>>>> v3:
>>>>  * Minor commit message adjustment.
>>>>  * Drop changes to recalculate_cpuid_policy().  Deferred to a later series.
>>> With this dropped, with the title not saying "max/default", and with
>>> the description also not mentioning "live" policies at all, I don't
>>> think this patch is self-consistent (meaning in particular: leaving
>>> aside the fact that there's no way right now to request e.g. both
>>> RSBA and RRSBA for a guest; aiui it is possible for Dom0).
>>>
>>> As you may imagine I'm also curious why you decided to drop this.
>> Because when I tried doing levelling in Xapi, I remembered why I did it
>> the way I did in v1, and why the v2 way was wrong.
>>
>> Xen cannot safely edit what the toolstack provides, so must not. 
> And this is the part I don't understand: Why can't we correct the
> (EIBRS,RSBA,RRSBA) tuple to a combination that is "legal"? At least
> as long as ...
>
>> Instead, failing the set_policy() call is an option, and is what we want
>> to do long term,
> ... we aren't there.
>
>> but also happens to be wrong too in this case. An admin
>> may know that a VM isn't using retpoline, and may need to migrate it
>> anyway for a number of reasons, so any safety checks need to be in the
>> toolstack, and need to be overrideable with something like --force.
> Possibly leading to an inconsistent policy exposed to a guest? I
> guess this may be the only option when we can't really resolve an
> ambiguity, but that isn't the case here, is it?


Wrong.  Xen does not have any knowledge of other hosts the VM might
migrate to.

So while Xen can spot problem combinations *on this host*, which way to
correct the problem combination depends on where the VM might migrate to.

Xen cannot safely correct a problem combination even if you don't wish
to allow the admin the ability to override the safety check.
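
To illustrate the ambiguity, here is a minimal sketch in C.  It is not Xen
code: struct spec_tuple, tuple_is_problematic(), drop_rrsba() and drop_rsba()
are hypothetical names invented for this example, and the "problematic"
predicate only encodes the two cases named in the commit message (RRSBA
without EIBRS, and both RSBA and RRSBA visible).

```c
#include <stdbool.h>

/* Hypothetical view of the three bits discussed in this thread. */
struct spec_tuple {
    bool eibrs, rsba, rrsba;
};

/*
 * Assumption: the only combinations treated as problematic are the two the
 * commit message calls out, i.e. RRSBA without EIBRS, or RSBA and RRSBA
 * both visible.
 */
bool tuple_is_problematic(struct spec_tuple t)
{
    return (t.rrsba && !t.eibrs) || (t.rsba && t.rrsba);
}

/* Candidate correction A: keep RSBA, hide RRSBA. */
struct spec_tuple drop_rrsba(struct spec_tuple t)
{
    t.rrsba = false;
    return t;
}

/* Candidate correction B: keep RRSBA, hide RSBA. */
struct spec_tuple drop_rsba(struct spec_tuple t)
{
    t.rsba = false;
    return t;
}
```

Starting from (EIBRS=1, RSBA=1, RRSBA=1), both corrections produce a tuple
that passes the check on this host, yet they differ in which bit the guest
ends up seeing.  Nothing in the sketch, and nothing Xen knows locally, says
which of the two suits the other hosts the VM may later run on.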

>
>> I don't really associate "derive policies" with anything other than the
>> system policies.  Domain construction isn't any kind of derivation -
>> it's simply doing what the toolstack asks.
> Hmm, I see. To me, since we do certain adjustments, "derive" still
> fits there as well. But I'm not going to insist on a subject
> adjustment then, given that imo both ways of looking at things make
> some sense.

It's a problem that Xen ever made adjustments behind the toolstack's
back, and this decade of technical debt has been extremely difficult to
address.  I guess I still view it in terms of the end properties, not
the intermediate mess.

~Andrew

Re: [PATCH v3 4/4] x86/cpu-policy: Derive RSBA/RRSBA for guest policies