On 26/11/2019 16:05, Jan Beulich wrote: > On 26.11.2019 16:59, Andrew Cooper wrote: >> On 26/11/2019 15:32, Jan Beulich wrote: >>> On 26.11.2019 13:03, Andrew Cooper wrote: >>>> ICEBP isn't handled well by SVM. >>>> >>>> The VMexit state for a #DB-vectored TASK_SWITCH has %rip pointing to the >>>> appropriate instruction boundary (fault or trap, as appropriate), except >>>> for >>>> an ICEBP-induced #DB TASK_SWITCH, where %rip points at the ICEBP >>>> instruction >>>> rather than after it. As ICEBP isn't distinguished in the vectoring event >>>> type, the state is ambiguous. >>>> >>>> To add to the confusion, an ICEBP which occurs due to Introspection >>>> intercepting the instruction, or from x86_emulate() will have %rip updated >>>> as >>>> a consequence of partial emulation required to inject an ICEBP event in the >>>> first place. >>>> >>>> We could in principle spot the non-injected case in the TASK_SWITCH >>>> handler, >>>> but this still results in complexity if the ICEBP instruction also has an >>>> Instruction Breakpoint active on it (which genuinely has fault semantics). >>>> >>>> Unconditionally intercept ICEBP. This does have a trap semantics for the >>>> intercept, and allows us to move %rip forwards appropriately before the >>>> TASK_SWITCH intercept is hit. >>> Both because of you mentioning the moving forwards of %rip and with the >>> irc discussion in mind that we had no irc, don't you mean "fault >>> semantics" here? >> ICEBP really is too broken under SVM to handle architecturally. >> >> The ICEBP intercept has nRIP decode support, because it is an >> instruction intercept. We emulate the injection (because it is ICEBP), >> which means we re-enter the guest with %rip moved forward, and #DB >> (HW_EXCEPTION) pending for injection. This means that... >> >>> If so >>> Reviewed-by: Jan Beulich <jbeul...@suse.com> >> ... the ICEBP-#DB-vectored TASK_SWITCH will now find %rip pointing after >> the ICEBP instruction, rather than at it, making it consistent with >> every other #DB-vectored TASK_SWITCH. >> >> This does means that an early task-switch fault for ICEBP will reliably >> be delivered with the wrong (i.e. trap) semantics, but this is less bad >> than mixed fault/trap semantics depending on whether the source of the >> ICEBP was introspection/emulation or native execution. >> >> We could restore proper fault behaviour by extending >> svm_emul_swint_injection() to figure out that a task switch is needed, >> and invoke hvm_task_switch() directly, but I don't have enough TUITS >> right now. >> >>> Otherwise I guess I'm still missing something. >> I hope this clears it up. > Well, it helps, but you don't really answer the question: Is "trap" > in that sentence of the description really correct? I.e. don't you > instead mean "fault" there?
I've reworded that bit to: Unconditionally intercept ICEBP. This does have NRIPs support as it is an instruction intercept, which allows us allows us to move %rip forwards appropriately before the TASK_SWITCH intercept is hit. This allows... Any better? ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel