On 04/09/2020 07:55, Jan Beulich wrote:
> On 03.09.2020 23:36, Andrew Cooper wrote:
>> There are multiple bugs with the existing implementation, including incorrect
>> comments.
>>
>> On AMD CPUs prior to Zen2, loading a NUL segment selector doesn't clear the
>> segment base, which is a problem for 64bit code which typically expects to 
>> use
>> a NUL %fs/%gs selector.
>>
>> On a context switch from any PV vcpu, to a 64bit PV vcpu with an %fs/%gs
>> selector which faults, the fixup logic loads NUL, and the guest is entered at
>> the failsafe callback with the stale base.
>>
>> Alternatively, a PV context switch sequence of 64 (NUL, non-zero base) =>
>> 32 (NUL) => 64 (NUL, zero base) will similarly cause Xen to enter the guest
>> with a stale base.
>>
>> Both of these corner cases manifest as state corruption in the final vcpu.
>> However, damage is limited to to 64bit code expecting to use Thread Local
>> Storage with a base pointer of 0, which doesn't occur by default.
>>
>> The context switch logic is extremely complicated, and is attempting to
>> optimise away loading a NUL selector (which is fast), or writing a 64bit base
>> of 0 (which is rare).  Furthermore, it fails to respect Linux's ABI with
>> userspace, which manifests as userspace state corruption as far as Linux is
>> concerned.
>>
>> Always save and restore all selector and base state, in all cases.
>>
>> Leave a large comment explaining hardware behaviour, and the new ABI
>> expectations.  Update the comments in the public headers.
>>
>> Drop all "segment preloading" to handle the AMD corner case.  It was never
>> anything but a waste of time for %ds/%es, and isn't needed now that %fs/%gs
>> bases are unconditionally written for 64bit PV guests.  In load_segments(),
>> store the result of is_pv_32bit_vcpu() as it is an expensive predicate now,
>> and not used in a way which impacts speculative safety.
>>
>> Reported-by: Andy Lutomirski <l...@kernel.org>
>> Reported-by: Sarah Newman <s...@prgmr.com>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> Reviewed-by: Jan Beulich <jbeul...@suse.com>

I'm afraid I've found further bugs an ABI work to do.  v2 coming shortly.

~Andrew

Reply via email to