Hopefully I got the In-Reply-To header right...

On Thu, May 28, 2020, Paolo Bonzini wrote:
> This allows exceptions injected by the emulator to be properly delivered
> as vmexits.  The code also becomes simpler, because we can just let all
> L0-intercepted exceptions go through the usual path.  In particular, our
> emulation of the VMX #DB exit qualification is very much simplified,
> because the vmexit injection path can use kvm_deliver_exception_payload
> to update DR6.

Sadly, it's also completely and utterly broken for #UD and #GP, and a bit
sketchy for #AC.

Unless KVM (L0) knowingly wants to override L1, e.g. KVM_GUESTDBG_* cases, KVM
shouldn't do a damn thing except forward the exception to L1 if L1 wants the
exception.
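To spell out the ordering I have in mind, here's a rough standalone sketch. It is not KVM code: route_l2_exception(), the l0_override mask and the l1_intercepts mask are all made-up names for illustration, and the only real values are the x86 vector numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 exception vector numbers. */
#define DB_VECTOR 1
#define UD_VECTOR 6
#define GP_VECTOR 13

/* Hypothetical: who ends up handling an exception that L2 took. */
enum exit_target { TARGET_L0, TARGET_L1 };

/* True if the bitmask has the bit for this vector set. */
static bool wants(uint32_t mask, int vec)
{
	return (mask & (UINT32_C(1) << vec)) != 0;
}

/*
 * l0_override:   vectors L0 deliberately claims for itself, e.g. because
 *                userspace enabled guest debugging (assumption for the sketch).
 * l1_intercepts: vectors L1 asked to intercept for L2.
 */
static enum exit_target route_l2_exception(int vec, uint32_t l0_override,
					   uint32_t l1_intercepts)
{
	/* L0 only wins when it knowingly wants to override L1. */
	if (wants(l0_override, vec))
		return TARGET_L0;

	/* Otherwise L1 gets the exception untouched if it asked for it. */
	if (wants(l1_intercepts, vec))
		return TARGET_L1;

	/* Nobody in L1 cares, so L0 handles its own intercept. */
	return TARGET_L0;
}
```

E.g. with L1 intercepting #GP and no L0 override, route_l2_exception(GP_VECTOR, 0, 1u << GP_VECTOR) picks L1 without L0 doing any emulation work first.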

ud_interception() and gp_interception() do quite a bit before forwarding the
exception, and in the case of #UD, it's entirely possible the #UD will never get
forwarded to L1.  #GP is even more problematic because it's a contributory
exception, and kvm_multiple_exception() is not equipped to check and handle
nested intercepts before vectoring the exception, which means KVM will
incorrectly escalate #GP to #DF, and #GP->#DF to Triple Fault, instead of
exiting to L1.  That's a wee bit problematic since KVM also has a
soon-to-be-fixed bug where it kills L1 on a Triple Fault in L2...
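A toy model of the escalation problem, in case it helps.  This is a simplified standalone sketch, not what kvm_multiple_exception() actually looks like: queue_exception(), the check_l1_first flag and the enums are invented for illustration.  The exception classes follow the architectural double-fault rules (contributory on contributory, or anything non-benign on #PF, becomes #DF), and any exception on top of a pending #DF is modelled as a Triple Fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 exception vector numbers. */
#define DE_VECTOR 0
#define DF_VECTOR 8
#define TS_VECTOR 10
#define NP_VECTOR 11
#define SS_VECTOR 12
#define GP_VECTOR 13
#define PF_VECTOR 14
#define NO_PENDING (-1)	/* hypothetical marker: nothing queued yet */

enum exc_class { EXC_BENIGN, EXC_CONTRIBUTORY, EXC_PF };
enum outcome { DELIVER_TO_L2, EXIT_TO_L1, ESCALATE_TO_DF, TRIPLE_FAULT };

static enum exc_class classify(int vec)
{
	switch (vec) {
	case PF_VECTOR:
		return EXC_PF;
	case DE_VECTOR:
	case TS_VECTOR:
	case NP_VECTOR:
	case SS_VECTOR:
	case GP_VECTOR:
		return EXC_CONTRIBUTORY;
	default:
		return EXC_BENIGN;
	}
}

/*
 * pending:        vector already being delivered to L2, or NO_PENDING.
 * vec:            new exception to queue.
 * l1_intercepts:  bitmask of vectors L1 wants to intercept.
 * check_l1_first: consult L1's intercepts before merging (the sane order).
 */
static enum outcome queue_exception(int pending, int vec,
				    uint32_t l1_intercepts, bool check_l1_first)
{
	enum exc_class c1, c2;

	/* If L1 wants this vector, exit to L1 before any escalation. */
	if (check_l1_first && (l1_intercepts & (UINT32_C(1) << vec)))
		return EXIT_TO_L1;

	if (pending == NO_PENDING)
		return DELIVER_TO_L2;

	/* Anything on top of a pending #DF is fatal in this model. */
	if (pending == DF_VECTOR)
		return TRIPLE_FAULT;

	c1 = classify(pending);
	c2 = classify(vec);
	if ((c1 == EXC_CONTRIBUTORY && c2 == EXC_CONTRIBUTORY) ||
	    (c1 == EXC_PF && c2 != EXC_BENIGN))
		return ESCALATE_TO_DF;

	return DELIVER_TO_L2;
}
```

With a #GP already pending and L1 intercepting #GP, a second #GP gives ESCALATE_TO_DF when check_l1_first is false and EXIT_TO_L1 when it is true, which is the difference described above.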

I think this will fix the bugs; I'll properly test and post next week.

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 90a1704b5752..928e11646dca 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -926,11 +926,11 @@ static int nested_svm_intercept(struct vcpu_svm *svm)
        }
        case SVM_EXIT_EXCP_BASE ... SVM_EXIT_EXCP_BASE + 0x1f: {
                /*
-                * Host-intercepted exceptions have been checked already in
-                * nested_svm_exit_special.  There is nothing to do here,
-                * the vmexit is injected by svm_check_nested_events.
+                * Note, KVM may already have snagged exceptions it wants to
+                * handle even if L1 also wants the exception, e.g. #MC.
                 */
-               vmexit = NESTED_EXIT_DONE;
+               if (vmcb_is_intercept(&svm->nested.ctl, exit_code))
+                       vmexit = NESTED_EXIT_DONE;
                break;
        }
        case SVM_EXIT_ERR: {
@@ -1122,19 +1122,23 @@ int nested_svm_exit_special(struct vcpu_svm *svm)
        case SVM_EXIT_INTR:
        case SVM_EXIT_NMI:
        case SVM_EXIT_NPF:
+       case SVM_EXIT_EXCP_BASE + MC_VECTOR:
                return NESTED_EXIT_HOST;
-       case SVM_EXIT_EXCP_BASE ... SVM_EXIT_EXCP_BASE + 0x1f: {
+       case SVM_EXIT_EXCP_BASE + DB_VECTOR:
+       case SVM_EXIT_EXCP_BASE + BP_VECTOR: {
+               /* KVM gets first crack at #DBs and #BPs, if it wants them. */
                u32 excp_bits = 1 << (exit_code - SVM_EXIT_EXCP_BASE);
                if (svm->vmcb01.ptr->control.intercepts[INTERCEPT_EXCEPTION] &
                    excp_bits)
                        return NESTED_EXIT_HOST;
-               else if (exit_code == SVM_EXIT_EXCP_BASE + PF_VECTOR &&
-                        svm->vcpu.arch.apf.host_apf_flags)
-                       /* Trap async PF even if not shadowing */
-                       return NESTED_EXIT_HOST;
                break;
        }
+       case SVM_EXIT_EXCP_BASE + PF_VECTOR:
+               /* Trap async PF even if not shadowing */
+               if (svm->vcpu.arch.apf.host_apf_flags)
+                       return NESTED_EXIT_HOST;
+               break;
        default:
                break;
        }
