sgx: Enable automatic SVN updates for SGX enclaves

Dave Hansen Wed, 07 May 2025 09:11:28 -0700

On 5/7/25 04:14, Elena Reshetova wrote:
> In case an SGX vulnerability is discovered and TCB recovery
> for SGX is triggered, Intel specifies a process that must be
> followed for a given vulnerability. Steps to mitigate can vary
> based on vulnerability type, affected components, etc.
> In some cases, a vulnerability can be mitigated via a runtime
> recovery flow by shutting down all running SGX enclaves,
> clearing enclave page cache (EPC), applying a microcode patch
> that does not require a reboot (via late microcode loading) and
> restarting all SGX enclaves.


Plain language and brevity have a lot of value in changelogs. There's a
substantial amount of irrelevant content here.

> Even when the above-described runtime recovery flow to mitigate the
> SGX vulnerability is followed, the SGX attestation evidence will
> still reflect the security SVN version being equal to the previous
> state of security SVN (containing vulnerability) that created
> and managed the enclave until the runtime recovery event. This
> limitation currently can be only overcome via a platform reboot,
> which negates all the benefits from the rebootless late microcode
> loading and not required in this case for functional or security
> purposes.

Can this please be trimmed down?

Actually, I think I wrote changelogs for this once upon a time. Could
you please go dig those up and use them?

> SGX architecture introduced a new instruction called ENCLS[EUPDATESVN]
> to Ice Lake [1].

Is it really on "Ice Lake" parts? Like, does it show up on
INTEL_ICELAKE? If not, this is only confusing and mostly irrelevant
information.

> It allows updating security SVN version, given that EPC
> is completely empty. The latter is required for security reasons
> in order to reason that enclave security posture is as secure as the
> security SVN version of the TCB that created it.
> 
> Additionally it is important to ensure that while ENCLS[EUPDATESVN]
> runs, no concurrent page creation happens in EPC, because it might
> result in #GP delivered to the creator. Legacy SW might not be prepared
> to handle such unexpected #GPs and therefore this patch introduces
> a locking mechanism in sgx_(vepc_)open to ensure no concurrent EPC
> allocations can happen.
> 
> Implement ENCLS[EUPDATESVN] and enable kernel to opportunistically
> call it during sgx_(vepc_)open().


> [1]
> https://cdrdv2.intel.com/v1/dl/getContent/648682?explicitVersion=true

These become stale almost immediately. Please also cite the document title.

>  arch/x86/include/asm/sgx.h       | 42 ++++++++++++-------
>  arch/x86/kernel/cpu/sgx/driver.c |  4 ++
>  arch/x86/kernel/cpu/sgx/encl.c   |  1 +
>  arch/x86/kernel/cpu/sgx/encls.h  |  5 +++
>  arch/x86/kernel/cpu/sgx/main.c   | 70 ++++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/sgx/sgx.h    |  3 ++
>  arch/x86/kernel/cpu/sgx/virt.c   |  6 +++
>  7 files changed, 116 insertions(+), 15 deletions(-)

Gah, how did this get squished back down to a single patch? It was
multiple patches before. There are multiple logical things going on here
and they need to be broken out.

> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> index 6a0069761508..8ac026ef6365 100644
> --- a/arch/x86/include/asm/sgx.h
> +++ b/arch/x86/include/asm/sgx.h
> @@ -27,22 +27,26 @@
>  /* The bitmask for the EPC section type. */
>  #define SGX_CPUID_EPC_MASK   GENMASK(3, 0)
>  
> +/* EUPDATESVN support in CPUID.0x12.0.EAX */
> +#define SGX_CPUID_EUPDATESVN BIT(10)
> +
>  enum sgx_encls_function {
> -     ECREATE = 0x00,
> -     EADD    = 0x01,
> -     EINIT   = 0x02,
> -     EREMOVE = 0x03,
> -     EDGBRD  = 0x04,
> -     EDGBWR  = 0x05,
> -     EEXTEND = 0x06,
> -     ELDU    = 0x08,
> -     EBLOCK  = 0x09,
> -     EPA     = 0x0A,
> -     EWB     = 0x0B,
> -     ETRACK  = 0x0C,
> -     EAUG    = 0x0D,
> -     EMODPR  = 0x0E,
> -     EMODT   = 0x0F,
> +     ECREATE         = 0x00,
> +     EADD            = 0x01,
> +     EINIT           = 0x02,
> +     EREMOVE         = 0x03,
> +     EDGBRD          = 0x04,
> +     EDGBWR          = 0x05,
> +     EEXTEND         = 0x06,
> +     ELDU            = 0x08,
> +     EBLOCK          = 0x09,
> +     EPA                     = 0x0A,
> +     EWB                     = 0x0B,
> +     ETRACK          = 0x0C,
> +     EAUG            = 0x0D,
> +     EMODPR          = 0x0E,
> +     EMODT           = 0x0F,
> +     EUPDATESVN      = 0x18,
>  };
>  
>  /**
> @@ -73,6 +77,11 @@ enum sgx_encls_function {
>   *                           public key does not match IA32_SGXLEPUBKEYHASH.
>   * %SGX_PAGE_NOT_MODIFIABLE: The EPC page cannot be modified because it
>   *                           is in the PENDING or MODIFIED state.
> + * %SGX_INSUFFICIENT_ENTROPY:        Insufficient entropy in RNG.
> + * %SGX_EPC_NOT_READY:                       EPC is not ready for SVN update.
> + * %SGX_NO_UPDATE:           EUPDATESVN was successful, but CPUSVN was not
> + *                           updated because current SVN was not newer than
> + *                           CPUSVN.
>   * %SGX_UNMASKED_EVENT:              An unmasked event, e.g. INTR, was 
> received
>   */
>  enum sgx_return_code {
> @@ -81,6 +90,9 @@ enum sgx_return_code {
>       SGX_CHILD_PRESENT               = 13,
>       SGX_INVALID_EINITTOKEN          = 16,
>       SGX_PAGE_NOT_MODIFIABLE         = 20,
> +     SGX_INSUFFICIENT_ENTROPY        = 29,
> +     SGX_EPC_NOT_READY                       = 30,
> +     SGX_NO_UPDATE                           = 31,
>       SGX_UNMASKED_EVENT              = 128,
>  };

I'd *much* rather that these mechanical constant introductions and
mindless refactoring (like reindenting) not be mixed with actual logic code.

>  
> diff --git a/arch/x86/kernel/cpu/sgx/driver.c 
> b/arch/x86/kernel/cpu/sgx/driver.c
> index 7f8d1e11dbee..669e44d61f9f 100644
> --- a/arch/x86/kernel/cpu/sgx/driver.c
> +++ b/arch/x86/kernel/cpu/sgx/driver.c
> @@ -19,6 +19,10 @@ static int sgx_open(struct inode *inode, struct file *file)
>       struct sgx_encl *encl;
>       int ret;
>  
> +     ret = sgx_inc_usage_count();
> +     if (ret)
> +             return ret;
> +
>       encl = kzalloc(sizeof(*encl), GFP_KERNEL);
>       if (!encl)
>               return -ENOMEM;
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 279148e72459..3b54889ae4a4 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -765,6 +765,7 @@ void sgx_encl_release(struct kref *ref)
>       WARN_ON_ONCE(encl->secs.epc_page);
>  
>       kfree(encl);
> +     sgx_dec_usage_count();
>  }
>  
>  /*
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index 99004b02e2ed..788acf8ec563 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -233,4 +233,9 @@ static inline int __eaug(struct sgx_pageinfo *pginfo, 
> void *addr)
>       return __encls_2(EAUG, pginfo, addr);
>  }
>  
> +/* Update CPUSVN at runtime. */
> +static inline int __eupdatesvn(void)
> +{
> +     return __encls_ret_1(EUPDATESVN, "");
> +}
>  #endif /* _X86_ENCLS_H */
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 8ce352fc72ac..2872567cd22b 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -15,6 +15,7 @@
>  #include <linux/sysfs.h>
>  #include <linux/vmalloc.h>
>  #include <asm/sgx.h>
> +#include <asm/archrandom.h>
>  #include "driver.h"
>  #include "encl.h"
>  #include "encls.h"
> @@ -914,6 +915,73 @@ int sgx_set_attribute(unsigned long *allowed_attributes,
>  }
>  EXPORT_SYMBOL_GPL(sgx_set_attribute);
>  
> +static bool sgx_has_eupdatesvn;

We have CPUID "caches" of sorts. Why open code this?

> +static atomic_t sgx_usage_count;

Is 32 bits enough for sgx_usage_count? What goes boom when it overflows?

> +static DEFINE_MUTEX(sgx_svn_lock);

What does this lock protect?

> +/**
> + * sgx_updatesvn() - Issue ENCLS[EUPDATESVN]
> + * If EPC is empty, this instruction will update CPUSVN to the currently
> + * loaded microcode update SVN and generate new cryptographic assets.

This is *NOT* EUPDATESVN. "this instruction" is not what's happening here.

sgx_updatesvn() _tries_ to update the SVN. Most of the time, there will
be no update and that's OK. This should only be called with EPC is empty.

> + * Return:
> + * 0: Success or not supported
> + * errno on error

I'm not a big fan of filling thing out just because.

What value is there in saying "errno on error"?

> + */
> +static int sgx_update_svn(void)
> +{
> +     int retry = RDRAND_RETRY_LOOPS;
> +     int ret;
> +
> +     if (!sgx_has_eupdatesvn)
> +             return 0;

This looks goofy. Why is it OK to just silently ignore an update
request? (I know the answer, but it needs to be obvious)

> +     do {
> +             ret = __eupdatesvn();
> +     } while (ret == SGX_INSUFFICIENT_ENTROPY && --retry);


        for (i = 0; i < RDRAND_RETRY_LOOPS; i++) {
                ret = __eupdatesvn();

                /* Stop on success or unexpected errors: */
                if (ret != SGX_INSUFFICIENT_ENTROPY)
                        break;
        }

> +     if (!ret || ret == SGX_NO_UPDATE) {
> +             /*
> +              * SVN successfully updated, or it was already up-to-date.
> +              * Let users know when the update was successful.
> +              */
> +             if (!ret)
> +                     pr_info("SVN updated successfully\n");
> +             return 0;
> +     }

Isn't this a lot simpler?

        if (ret == SGX_NO_UPDATE)
                return 0;

        if (!ret) {
                pr_info("SVN updated successfully\n");
                return 0;
        }

> +     /*
> +      * EUPDATESVN was called when EPC is empty, all other error
> +      * codes are unexcepted except running out of entropy.

                        ^ unexpected

Spell check, please.

> +      */
> +     if (ret != SGX_INSUFFICIENT_ENTROPY)
> +             ENCLS_WARN(ret, "EUPDATESVN");
> +     return ret;
> +}

The indentation here is backwards. The error paths should be indented
and the success path at the lowest indent whenever possible. This, for
example:

        if (ret == SGX_NO_UPDATE)
                return 0;

        if (ret) {
                ENCLS_WARN(ret, "EUPDATESVN");
                return ret;
        }

        pr_info("SVN updated successfully\n");
        return 0;

Oh, and do we expect SGX_INSUFFICIENT_ENTROPY all the time? I thought it
was supposed to be horribly rare. Shouldn't we warn on it?

> +int sgx_inc_usage_count(void)
> +{

No comments, eh?

What does success _mean_? What does failure mean?

> +     int ret;
> +
> +     if (atomic_inc_not_zero(&sgx_usage_count))
> +             return 0;
> +
> +     guard(mutex)(&sgx_svn_lock);
> +
> +     if (atomic_inc_not_zero(&sgx_usage_count))
> +             return 0;
> +
> +     ret = sgx_update_svn();
> +     if (!ret)
> +             atomic_inc(&sgx_usage_count);
> +     return ret;
> +}

Gah, this is 100% *NOT* obvious what's going on. Yet there are zero
comments on it. The lock is uncommented. The atomic is uncommented.

What does any of this mean? What do the components do?

> +
> +void sgx_dec_usage_count(void)
> +{
> +     atomic_dec(&sgx_usage_count);
> +}



>  static int __init sgx_init(void)
>  {
>       int ret;
> @@ -947,6 +1015,8 @@ static int __init sgx_init(void)
>       if (sgx_vepc_init() && ret)
>               goto err_provision;
>  
> +     sgx_has_eupdatesvn = (cpuid_eax(SGX_CPUID) & SGX_CPUID_EUPDATESVN);
> +
>       return 0;
>  
>  err_provision:
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index d2dad21259a8..f5940393d9bd 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -102,6 +102,9 @@ static inline int __init sgx_vepc_init(void)
>  }
>  #endif
>  
> +int sgx_inc_usage_count(void);
> +void sgx_dec_usage_count(void);
> +
>  void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
>  
>  #endif /* _X86_SGX_H */
> diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> index 7aaa3652e31d..e548de340c2e 100644
> --- a/arch/x86/kernel/cpu/sgx/virt.c
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -255,12 +255,18 @@ static int sgx_vepc_release(struct inode *inode, struct 
> file *file)
>       xa_destroy(&vepc->page_array);
>       kfree(vepc);
>  
> +     sgx_dec_usage_count();
>       return 0;
>  }
>  
>  static int sgx_vepc_open(struct inode *inode, struct file *file)
>  {
>       struct sgx_vepc *vepc;
> +     int ret;
> +
> +     ret = sgx_inc_usage_count();
> +     if (ret)
> +             return ret;
>  
>       vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL);
>       if (!vepc)

I think I'd do this in at least 4 patches:

1. Introduce the usage count tracking: the atomic and the open/release
   "hooks", maybe without error handling on the open() side
2. Introduce the EUPDATESVN mechanical bits: the CPUID bit, the
   enumeration, the bool, the new error enum values
3. Introduce the mechanical eupdatesvn function. The retry loop and the
   "no entropy" handling
4. Plumb #3 into #1

#4 is your place to argue if EUPDATESVN failures should cascade to open().

Re: [PATCH v4 1/1] x86/sgx: Enable automatic SVN updates for SGX enclaves

Reply via email to