> Date: Thu, 26 Jan 2023 10:29:00 -0500
> From: Dave Voutila <d...@sisu.io>
>
> 1) disable PKU support for guests by masking the cpuid extended feature
> bit advertising support (I believe this is already done) and *also*
> intercepting and prohibiting flipping the CR4 PKE bit to 1. The
> interception of the CR4 bit flip is key. Without the bit set,
> wrpkru/rdpkru generate #UD, so you can prohibit usage.
>
> I believe the CR4 intercept is done only in nvmm's VMX functionality
> and luckily is preventing that bit being set, but I'm not 100%
> positive about this. I do know it is *not* done for SVM guests.
I've drafted a patch to filter the CR4 bits for SVM guests.  However, I
don't have any SVM hardware, so I'll need to find someone to test it.

> or
>
> 2) save and restore the PKRU value at guest entry/exit, similar to FPU
> state.
>
> I think (2) is the right approach and I have a diff for OpenBSD already
> that should be landing shortly. For now, as we're testing the use of PKU
> in OpenBSD, we have it disabled when we detect we're running under a
> hypervisor.

That sounds reasonable.  If you would like to share the diff I'd be
happy to take a look.  I need to familiarize myself with the relevant
parts of the architecture manual first.

> We plan on enabling it prior to the release of OpenBSD 7.3 and this will
> have ramifications for OpenBSD guests under nvmm potentially trampling
> or breaking other guests. I'm happy to have a conversation with any
> NetBSD devs looking to update nvmm to support these features.

Thanks -- do you have an image we can run in a guest to test the
potential trampling?
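For reference, approach (2) would look roughly like the following.  This
is a minimal user-space sketch, not nvmm code: rdpkru()/wrpkru() here
are simulated stand-ins for the real RDPKRU/WRPKRU instructions (which
require CR4.PKE=1 on PKU-capable hardware), and vcpu_run() is a
hypothetical entry path, not an actual nvmm function.

```c
#include <stdint.h>

/*
 * Sketch of option (2): save the host PKRU and install the guest's
 * PKRU around guest entry, then restore on exit, analogous to how
 * FPU state is handled.
 */
static uint32_t cpu_pkru;		/* simulated PKRU register */

static uint32_t rdpkru(void) { return cpu_pkru; }
static void     wrpkru(uint32_t v) { cpu_pkru = v; }

struct vcpu {
	uint32_t guest_pkru;		/* guest's saved PKRU */
};

static uint32_t
vcpu_run(struct vcpu *vcpu)
{
	uint32_t host_pkru;

	host_pkru = rdpkru();		/* save host protection keys */
	wrpkru(vcpu->guest_pkru);	/* enter guest with its PKRU */

	/* ... VMRUN would happen here; guest may change PKRU ... */

	vcpu->guest_pkru = rdpkru();	/* save guest's PKRU on exit */
	wrpkru(host_pkru);		/* restore host protection keys */
	return host_pkru;
}
```

Without something like this, a guest writing PKRU would silently
clobber the host's (or another guest's) protection keys, which is the
trampling scenario described above.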
From ae344260c6d354438c6f8b08042ac8776875a462 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell <riastr...@netbsd.org>
Date: Sat, 28 Jan 2023 13:01:25 +0000
Subject: [PATCH] nvmm: Filter CR4 bits on x86 SVM (AMD).

In particular, prohibit PKE, Protection Key Enable, which requires some
additional management of CPU state by nvmm.
---
 sys/dev/nvmm/x86/nvmm_x86_svm.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/sys/dev/nvmm/x86/nvmm_x86_svm.c b/sys/dev/nvmm/x86/nvmm_x86_svm.c
index e705f6bbccad..80b82f02c194 100644
--- a/sys/dev/nvmm/x86/nvmm_x86_svm.c
+++ b/sys/dev/nvmm/x86/nvmm_x86_svm.c
@@ -523,6 +523,33 @@ static uint64_t svm_xcr0_mask __read_mostly;
 #define CR4_TLB_FLUSH \
 	(CR4_PSE|CR4_PAE|CR4_PGE|CR4_PCIDE|CR4_SMEP)
 
+#define CR4_VALID \
+	(CR4_VME | \
+	CR4_PVI | \
+	CR4_TSD | \
+	CR4_DE | \
+	CR4_PSE | \
+	CR4_PAE | \
+	CR4_MCE | \
+	CR4_PGE | \
+	CR4_PCE | \
+	CR4_OSFXSR | \
+	CR4_OSXMMEXCPT | \
+	CR4_UMIP | \
+	/* CR4_LA57 excluded */ \
+	/* bit 13 reserved on AMD */ \
+	/* bit 14 reserved on AMD */ \
+	/* bit 15 reserved on AMD */ \
+	CR4_FSGSBASE | \
+	CR4_PCIDE | \
+	CR4_OSXSAVE | \
+	/* bit 19 reserved on AMD */ \
+	CR4_SMEP | \
+	CR4_SMAP \
+	/* CR4_PKE excluded */ \
+	/* CR4_CET excluded */ \
+	/* bits 24:63 reserved on AMD */)
+
 /* -------------------------------------------------------------------------- */
 
 struct svm_machdata {
@@ -1853,6 +1880,7 @@ svm_vcpu_setstate(struct nvmm_cpu *vcpu)
 		vmcb->state.cr2 = state->crs[NVMM_X64_CR_CR2];
 		vmcb->state.cr3 = state->crs[NVMM_X64_CR_CR3];
 		vmcb->state.cr4 = state->crs[NVMM_X64_CR_CR4];
+		vmcb->state.cr4 &= CR4_VALID;
 
 		vmcb->ctrl.v &= ~VMCB_CTRL_V_TPR;
 		vmcb->ctrl.v |= __SHIFTIN(state->crs[NVMM_X64_CR_CR8],
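To illustrate the effect of the filter: whatever CR4 value the VM
requests for the guest, the PKE bit is silently dropped before it
reaches the VMCB.  A hypothetical, self-contained demo of just the
masking step (bit positions per the AMD64 architecture manual; the
abbreviated CR4_VALID below stands in for the full mask in the patch,
and filter_cr4() is an illustrative helper, not an nvmm function):

```c
#include <stdint.h>

/* CR4 bit positions from the AMD64 architecture manual. */
#define CR4_PAE  ((uint64_t)1 << 5)
#define CR4_PGE  ((uint64_t)1 << 7)
#define CR4_PKE  ((uint64_t)1 << 22)	/* Protection Key Enable */

/* Abbreviated stand-in for the patch's CR4_VALID; PKE deliberately absent. */
#define CR4_VALID (CR4_PAE | CR4_PGE /* | ... the other permitted bits */)

/* Mimics the added line in svm_vcpu_setstate(): mask off prohibited bits. */
static uint64_t
filter_cr4(uint64_t requested)
{
	return requested & CR4_VALID;
}
```

Since CR4.PKE can never become 1 in the guest, RDPKRU/WRPKRU raise #UD
there, so an SVM guest cannot touch PKRU at all until nvmm grows the
save/restore support from option (2).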