On 26/11/2019 16:05, Jan Beulich wrote:
> On 26.11.2019 16:59, Andrew Cooper wrote:
>> On 26/11/2019 15:32, Jan Beulich wrote:
>>> On 26.11.2019 13:03, Andrew Cooper wrote:
>>>> ICEBP isn't handled well by SVM.
>>>>
>>>> The VMexit state for a #DB-vectored TASK_SWITCH has %rip pointing to the
>>>> appropriate instruction boundary (fault or trap, as appropriate), except 
>>>> for
>>>> an ICEBP-induced #DB TASK_SWITCH, where %rip points at the ICEBP 
>>>> instruction
>>>> rather than after it.  As ICEBP isn't distinguished in the vectoring event
>>>> type, the state is ambiguous.
>>>>
>>>> To add to the confusion, an ICEBP which occurs due to Introspection
>>>> intercepting the instruction, or from x86_emulate() will have %rip updated 
>>>> as
>>>> a consequence of partial emulation required to inject an ICEBP event in the
>>>> first place.
>>>>
>>>> We could in principle spot the non-injected case in the TASK_SWITCH 
>>>> handler,
>>>> but this still results in complexity if the ICEBP instruction also has an
>>>> Instruction Breakpoint active on it (which genuinely has fault semantics).
>>>>
>>>> Unconditionally intercept ICEBP.  This does have a trap semantics for the
>>>> intercept, and allows us to move %rip forwards appropriately before the
>>>> TASK_SWITCH intercept is hit.
>>> Both because of you mentioning the moving forwards of %rip and with the
>>> irc discussion in mind that we had no irc, don't you mean "fault
>>> semantics" here?
>> ICEBP really is too broken under SVM to handle architecturally.
>>
>> The ICEBP intercept has nRIP decode support, because it is an
>> instruction intercept.  We emulate the injection (because it is ICEBP),
>> which means we re-enter the guest with %rip moved forward, and #DB
>> (HW_EXCEPTION) pending for injection.  This means that...
>>
>>>  If so
>>> Reviewed-by: Jan Beulich <jbeul...@suse.com>
>> ... the ICEBP-#DB-vectored TASK_SWITCH will now find %rip pointing after
>> the ICEBP instruction, rather than at it, making it consistent with
>> every other #DB-vectored TASK_SWITCH.
>>
>> This does means that an early task-switch fault for ICEBP will reliably
>> be delivered with the wrong (i.e. trap) semantics, but this is less bad
>> than mixed fault/trap semantics depending on whether the source of the
>> ICEBP was introspection/emulation or native execution.
>>
>> We could restore proper fault behaviour by extending
>> svm_emul_swint_injection() to figure out that a task switch is needed,
>> and invoke hvm_task_switch() directly, but I don't have enough TUITS
>> right now.
>>
>>> Otherwise I guess I'm still missing something.
>> I hope this clears it up.
> Well, it helps, but you don't really answer the question: Is "trap"
> in that sentence of the description really correct? I.e. don't you
> instead mean "fault" there?

I've reworded that bit to:

Unconditionally intercept ICEBP.  This does have NRIPs support as it is an
instruction intercept, which allows us allows us to move %rip forwards
appropriately before the TASK_SWITCH intercept is hit.  This allows...

Any better?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to