On 5/7/25 04:14, Elena Reshetova wrote: > In case an SGX vulnerability is discovered and TCB recovery > for SGX is triggered, Intel specifies a process that must be > followed for a given vulnerability. Steps to mitigate can vary > based on vulnerability type, affected components, etc. > In some cases, a vulnerability can be mitigated via a runtime > recovery flow by shutting down all running SGX enclaves, > clearing enclave page cache (EPC), applying a microcode patch > that does not require a reboot (via late microcode loading) and > restarting all SGX enclaves.
Plain language and brevity have a lot of value in changelogs. There's a substantial amount of irrelevant content here. > Even when the above-described runtime recovery flow to mitigate the > SGX vulnerability is followed, the SGX attestation evidence will > still reflect the security SVN version being equal to the previous > state of security SVN (containing vulnerability) that created > and managed the enclave until the runtime recovery event. This > limitation currently can be only overcome via a platform reboot, > which negates all the benefits from the rebootless late microcode > loading and not required in this case for functional or security > purposes. Can this please be trimmed down? Actually, I think I wrote changelogs for this once upon a time. Could you please go dig those up and use them? > SGX architecture introduced a new instruction called ENCLS[EUPDATESVN] > to Ice Lake [1]. Is it really on "Ice Lake" parts? Like, does it show up on INTEL_ICELAKE? If not, this is only confusing and mostly irrelevant information. > It allows updating security SVN version, given that EPC > is completely empty. The latter is required for security reasons > in order to reason that enclave security posture is as secure as the > security SVN version of the TCB that created it. > > Additionally it is important to ensure that while ENCLS[EUPDATESVN] > runs, no concurrent page creation happens in EPC, because it might > result in #GP delivered to the creator. Legacy SW might not be prepared > to handle such unexpected #GPs and therefore this patch introduces > a locking mechanism in sgx_(vepc_)open to ensure no concurrent EPC > allocations can happen. > > Implement ENCLS[EUPDATESVN] and enable kernel to opportunistically > call it during sgx_(vepc_)open(). > [1] > https://cdrdv2.intel.com/v1/dl/getContent/648682?explicitVersion=true These become stale almost immediately. Please also cite the document title. > arch/x86/include/asm/sgx.h | 42 ++++++++++++------- > arch/x86/kernel/cpu/sgx/driver.c | 4 ++ > arch/x86/kernel/cpu/sgx/encl.c | 1 + > arch/x86/kernel/cpu/sgx/encls.h | 5 +++ > arch/x86/kernel/cpu/sgx/main.c | 70 ++++++++++++++++++++++++++++++++ > arch/x86/kernel/cpu/sgx/sgx.h | 3 ++ > arch/x86/kernel/cpu/sgx/virt.c | 6 +++ > 7 files changed, 116 insertions(+), 15 deletions(-) Gah, how did this get squished back down to a single patch? It was multiple patches before. There are multiple logical things going on here and they need to be broken out. > diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h > index 6a0069761508..8ac026ef6365 100644 > --- a/arch/x86/include/asm/sgx.h > +++ b/arch/x86/include/asm/sgx.h > @@ -27,22 +27,26 @@ > /* The bitmask for the EPC section type. */ > #define SGX_CPUID_EPC_MASK GENMASK(3, 0) > > +/* EUPDATESVN support in CPUID.0x12.0.EAX */ > +#define SGX_CPUID_EUPDATESVN BIT(10) > + > enum sgx_encls_function { > - ECREATE = 0x00, > - EADD = 0x01, > - EINIT = 0x02, > - EREMOVE = 0x03, > - EDGBRD = 0x04, > - EDGBWR = 0x05, > - EEXTEND = 0x06, > - ELDU = 0x08, > - EBLOCK = 0x09, > - EPA = 0x0A, > - EWB = 0x0B, > - ETRACK = 0x0C, > - EAUG = 0x0D, > - EMODPR = 0x0E, > - EMODT = 0x0F, > + ECREATE = 0x00, > + EADD = 0x01, > + EINIT = 0x02, > + EREMOVE = 0x03, > + EDGBRD = 0x04, > + EDGBWR = 0x05, > + EEXTEND = 0x06, > + ELDU = 0x08, > + EBLOCK = 0x09, > + EPA = 0x0A, > + EWB = 0x0B, > + ETRACK = 0x0C, > + EAUG = 0x0D, > + EMODPR = 0x0E, > + EMODT = 0x0F, > + EUPDATESVN = 0x18, > }; > > /** > @@ -73,6 +77,11 @@ enum sgx_encls_function { > * public key does not match IA32_SGXLEPUBKEYHASH. > * %SGX_PAGE_NOT_MODIFIABLE: The EPC page cannot be modified because it > * is in the PENDING or MODIFIED state. > + * %SGX_INSUFFICIENT_ENTROPY: Insufficient entropy in RNG. > + * %SGX_EPC_NOT_READY: EPC is not ready for SVN update. > + * %SGX_NO_UPDATE: EUPDATESVN was successful, but CPUSVN was not > + * updated because current SVN was not newer than > + * CPUSVN. > * %SGX_UNMASKED_EVENT: An unmasked event, e.g. INTR, was > received > */ > enum sgx_return_code { > @@ -81,6 +90,9 @@ enum sgx_return_code { > SGX_CHILD_PRESENT = 13, > SGX_INVALID_EINITTOKEN = 16, > SGX_PAGE_NOT_MODIFIABLE = 20, > + SGX_INSUFFICIENT_ENTROPY = 29, > + SGX_EPC_NOT_READY = 30, > + SGX_NO_UPDATE = 31, > SGX_UNMASKED_EVENT = 128, > }; I'd *much* rather that these mechanical constant introductions and mindless refactoring (like reindenting) not be mixed with actual logic code. > > diff --git a/arch/x86/kernel/cpu/sgx/driver.c > b/arch/x86/kernel/cpu/sgx/driver.c > index 7f8d1e11dbee..669e44d61f9f 100644 > --- a/arch/x86/kernel/cpu/sgx/driver.c > +++ b/arch/x86/kernel/cpu/sgx/driver.c > @@ -19,6 +19,10 @@ static int sgx_open(struct inode *inode, struct file *file) > struct sgx_encl *encl; > int ret; > > + ret = sgx_inc_usage_count(); > + if (ret) > + return ret; > + > encl = kzalloc(sizeof(*encl), GFP_KERNEL); > if (!encl) > return -ENOMEM; > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > index 279148e72459..3b54889ae4a4 100644 > --- a/arch/x86/kernel/cpu/sgx/encl.c > +++ b/arch/x86/kernel/cpu/sgx/encl.c > @@ -765,6 +765,7 @@ void sgx_encl_release(struct kref *ref) > WARN_ON_ONCE(encl->secs.epc_page); > > kfree(encl); > + sgx_dec_usage_count(); > } > > /* > diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h > index 99004b02e2ed..788acf8ec563 100644 > --- a/arch/x86/kernel/cpu/sgx/encls.h > +++ b/arch/x86/kernel/cpu/sgx/encls.h > @@ -233,4 +233,9 @@ static inline int __eaug(struct sgx_pageinfo *pginfo, > void *addr) > return __encls_2(EAUG, pginfo, addr); > } > > +/* Update CPUSVN at runtime. */ > +static inline int __eupdatesvn(void) > +{ > + return __encls_ret_1(EUPDATESVN, ""); > +} > #endif /* _X86_ENCLS_H */ > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index 8ce352fc72ac..2872567cd22b 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -15,6 +15,7 @@ > #include <linux/sysfs.h> > #include <linux/vmalloc.h> > #include <asm/sgx.h> > +#include <asm/archrandom.h> > #include "driver.h" > #include "encl.h" > #include "encls.h" > @@ -914,6 +915,73 @@ int sgx_set_attribute(unsigned long *allowed_attributes, > } > EXPORT_SYMBOL_GPL(sgx_set_attribute); > > +static bool sgx_has_eupdatesvn; We have CPUID "caches" of sorts. Why open code this? > +static atomic_t sgx_usage_count; Is 32 bits enough for sgx_usage_count? What goes boom when it overflows? > +static DEFINE_MUTEX(sgx_svn_lock); What does this lock protect? > +/** > + * sgx_updatesvn() - Issue ENCLS[EUPDATESVN] > + * If EPC is empty, this instruction will update CPUSVN to the currently > + * loaded microcode update SVN and generate new cryptographic assets. This is *NOT* EUPDATESVN. "this instruction" is not what's happening here. sgx_updatesvn() _tries_ to update the SVN. Most of the time, there will be no update and that's OK. This should only be called with EPC is empty. > + * Return: > + * 0: Success or not supported > + * errno on error I'm not a big fan of filling thing out just because. What value is there in saying "errno on error"? > + */ > +static int sgx_update_svn(void) > +{ > + int retry = RDRAND_RETRY_LOOPS; > + int ret; > + > + if (!sgx_has_eupdatesvn) > + return 0; This looks goofy. Why is it OK to just silently ignore an update request? (I know the answer, but it needs to be obvious) > + do { > + ret = __eupdatesvn(); > + } while (ret == SGX_INSUFFICIENT_ENTROPY && --retry); for (i = 0; i < RDRAND_RETRY_LOOPS; i++) { ret = __eupdatesvn(); /* Stop on success or unexpected errors: */ if (ret != SGX_INSUFFICIENT_ENTROPY) break; } > + if (!ret || ret == SGX_NO_UPDATE) { > + /* > + * SVN successfully updated, or it was already up-to-date. > + * Let users know when the update was successful. > + */ > + if (!ret) > + pr_info("SVN updated successfully\n"); > + return 0; > + } Isn't this a lot simpler? if (ret == SGX_NO_UPDATE) return 0; if (!ret) { pr_info("SVN updated successfully\n"); return 0; } > + /* > + * EUPDATESVN was called when EPC is empty, all other error > + * codes are unexcepted except running out of entropy. ^ unexpected Spell check, please. > + */ > + if (ret != SGX_INSUFFICIENT_ENTROPY) > + ENCLS_WARN(ret, "EUPDATESVN"); > + return ret; > +} The indentation here is backwards. The error paths should be indented and the success path at the lowest indent whenever possible. This, for example: if (ret == SGX_NO_UPDATE) return 0; if (ret) { ENCLS_WARN(ret, "EUPDATESVN"); return ret; } pr_info("SVN updated successfully\n"); return 0; Oh, and do we expect SGX_INSUFFICIENT_ENTROPY all the time? I thought it was supposed to be horribly rare. Shouldn't we warn on it? > +int sgx_inc_usage_count(void) > +{ No comments, eh? What does success _mean_? What does failure mean? > + int ret; > + > + if (atomic_inc_not_zero(&sgx_usage_count)) > + return 0; > + > + guard(mutex)(&sgx_svn_lock); > + > + if (atomic_inc_not_zero(&sgx_usage_count)) > + return 0; > + > + ret = sgx_update_svn(); > + if (!ret) > + atomic_inc(&sgx_usage_count); > + return ret; > +} Gah, this is 100% *NOT* obvious what's going on. Yet there are zero comments on it. The lock is uncommented. The atomic is uncommented. What does any of this mean? What do the components do? > + > +void sgx_dec_usage_count(void) > +{ > + atomic_dec(&sgx_usage_count); > +} > static int __init sgx_init(void) > { > int ret; > @@ -947,6 +1015,8 @@ static int __init sgx_init(void) > if (sgx_vepc_init() && ret) > goto err_provision; > > + sgx_has_eupdatesvn = (cpuid_eax(SGX_CPUID) & SGX_CPUID_EUPDATESVN); > + > return 0; > > err_provision: > diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h > index d2dad21259a8..f5940393d9bd 100644 > --- a/arch/x86/kernel/cpu/sgx/sgx.h > +++ b/arch/x86/kernel/cpu/sgx/sgx.h > @@ -102,6 +102,9 @@ static inline int __init sgx_vepc_init(void) > } > #endif > > +int sgx_inc_usage_count(void); > +void sgx_dec_usage_count(void); > + > void sgx_update_lepubkeyhash(u64 *lepubkeyhash); > > #endif /* _X86_SGX_H */ > diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c > index 7aaa3652e31d..e548de340c2e 100644 > --- a/arch/x86/kernel/cpu/sgx/virt.c > +++ b/arch/x86/kernel/cpu/sgx/virt.c > @@ -255,12 +255,18 @@ static int sgx_vepc_release(struct inode *inode, struct > file *file) > xa_destroy(&vepc->page_array); > kfree(vepc); > > + sgx_dec_usage_count(); > return 0; > } > > static int sgx_vepc_open(struct inode *inode, struct file *file) > { > struct sgx_vepc *vepc; > + int ret; > + > + ret = sgx_inc_usage_count(); > + if (ret) > + return ret; > > vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL); > if (!vepc) I think I'd do this in at least 4 patches: 1. Introduce the usage count tracking: the atomic and the open/release "hooks", maybe without error handling on the open() side 2. Introduce the EUPDATESVN mechanical bits: the CPUID bit, the enumeration, the bool, the new error enum values 3. Introduce the mechanical eupdatesvn function. The retry loop and the "no entropy" handling 4. Plumb #3 into #1 #4 is your place to argue if EUPDATESVN failures should cascade to open().