[PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-04-29 Thread Dmitrii Kuvaiskii
e 100% CPU utilization from ksgxd which confirms that swapping happens). Result: 1,000 runs without hangs. (Sorry for the previous copy of this email, accidentally sent to sta...@vger.kernel.org. Failed to use `--suppress-cc` during a test send.) Dmitrii Kuvaiskii (2): x86/sgx: Resolve EAUG

[PATCH 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-04-29 Thread Dmitrii Kuvaiskii
86/sgx: Support adding of pages to an initialized enclave") Cc: sta...@vger.kernel.org Reported-by: Marcelina Kościelnicka Suggested-by: Reinette Chatre Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/encl.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a

[PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-04-29 Thread Dmitrii Kuvaiskii
("x86/sgx: Support complete page removal") Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/encl.c | 3 ++- arch/x86/kernel/cpu/sgx/encl.h | 3 +++ arch/x86/kernel/cpu/sgx/ioctl.c | 1 + 3 files changed, 6 insertions(+), 1 deletion(-) diff --g

Re: [PATCH 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-04-30 Thread Dmitrii Kuvaiskii
On Mon, Apr 29, 2024 at 04:06:39PM +0300, Jarkko Sakkinen wrote: > On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote: > > SGX runtimes such as Gramine may implement EDMM-based lazy allocation of > > enclave pages and may support MADV_DONTNEED semantics [1]. The former

Re: [PATCH 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-04-30 Thread Dmitrii Kuvaiskii
On Mon, Apr 29, 2024 at 04:04:24PM +0300, Jarkko Sakkinen wrote: > On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote: > > Two enclave threads may try to access the same non-present enclave page > > simultaneously (e.g., if the SGX runtime supports lazy allocation). The

Re: [PATCH 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-04-30 Thread Dmitrii Kuvaiskii
On Mon, Apr 29, 2024 at 04:11:03PM +0300, Jarkko Sakkinen wrote: > On Mon Apr 29, 2024 at 1:43 PM EEST, Dmitrii Kuvaiskii wrote: > > Two enclave threads may try to add and remove the same enclave page > > simultaneously (e.g., if the SGX runtime supports both lazy allocation >

[PATCH v2 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-05-15 Thread Dmitrii Kuvaiskii
any memory access to this page results in a normal "allocate and EAUG a page on #PF" flow. Fixes: 9849bb27152c ("x86/sgx: Support complete page removal") Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/encl.c | 3 ++- arch/x86/kernel/c

[PATCH v2 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-05-15 Thread Dmitrii Kuvaiskii
ult: 1,000 runs without hangs. [1] https://github.com/gramineproject/gramine/pull/1513 v1 -> v2: - No changes in code itself - Expanded cover letter - Added CPU1 vs CPU2 race scenarios in commit messages v1: https://lore.kernel.org/all/20240429104330.3636113-3-dmitrii.kuvais...@intel.com/ Dmit

[PATCH v2 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-05-15 Thread Dmitrii Kuvaiskii
e_epc_page() with sgx_free_epc_page(). Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") Cc: sta...@vger.kernel.org Reported-by: Marcelina Kościelnicka Suggested-by: Reinette Chatre Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/encl.c | 7 ++

[PATCH v3 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-05-17 Thread Dmitrii Kuvaiskii
ais...@intel.com/ Dmitrii Kuvaiskii (2): x86/sgx: Resolve EAUG race where losing thread returns SIGBUS x86/sgx: Resolve EREMOVE page vs EAUG page data race arch/x86/kernel/cpu/sgx/encl.c | 10 +++--- arch/x86/kernel/cpu/sgx/encl.h | 3 +++ arch/x86/kernel/cpu/sgx/ioctl.c | 1 + 3 f

[PATCH v3 1/2] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-05-17 Thread Dmitrii Kuvaiskii
ialized enclave") Cc: sta...@vger.kernel.org Reported-by: Marcelina Kościelnicka Suggested-by: Reinette Chatre Signed-off-by: Dmitrii Kuvaiskii Reviewed-by: Haitao Huang Reviewed-by: Jarkko Sakkinen Reviewed-by: Reinette Chatre --- arch/x86/kernel/cpu/sgx/encl.c | 7 +-- 1 file c

[PATCH v3 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-05-17 Thread Dmitrii Kuvaiskii
any memory access to this page results in a normal "allocate and EAUG a page on #PF" flow. Fixes: 9849bb27152c ("x86/sgx: Support complete page removal") Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii Reviewed-by: Haitao Huang Reviewed-by: Jarkko Sakkinen Acke

Re: [PATCH v3 2/2] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-06-07 Thread Dmitrii Kuvaiskii
On Tue, May 28, 2024 at 09:23:13AM -0700, Dave Hansen wrote: > On 5/17/24 04:06, Dmitrii Kuvaiskii wrote: > ... > > First, why is SGX so special here? How is the SGX problem different > than what the core mm code does? Here is my understanding of why SGX is so special and why I have

Re: [PATCH v3 0/2] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-06-07 Thread Dmitrii Kuvaiskii
On Tue, May 28, 2024 at 09:01:10AM -0700, Dave Hansen wrote: > On 5/17/24 04:06, Dmitrii Kuvaiskii wrote: > > We wrote a trivial stress test to reproduce the hangs observed in > > real-world applications. The test stresses #PF-based page allocation and > > SGX_IOC_ENCLAVE_REMO

[PATCH v4 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-07-05 Thread Dmitrii Kuvaiskii
ialized enclave") Cc: sta...@vger.kernel.org Reported-by: Marcelina Kościelnicka Suggested-by: Reinette Chatre Signed-off-by: Dmitrii Kuvaiskii Reviewed-by: Haitao Huang Reviewed-by: Jarkko Sakkinen Reviewed-by: Reinette Chatre --- arch/x86/kernel/cpu/sgx/encl.c | 7 +-- 1 file c

[PATCH v4 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags

2024-07-05 Thread Dmitrii Kuvaiskii
clave page is being removed; this new case will set only the SGX_ENCL_PAGE_BUSY flag. Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/encl.c | 16 +++- arch/x86/kernel/cpu/sgx/encl.h | 10 -- arch/x86/kernel/cpu/sgx/main.c | 4 ++-- 3 fil

[PATCH v4 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-07-05 Thread Dmitrii Kuvaiskii
nd EAUG a page on #PF" flow. Fixes: 9849bb27152c ("x86/sgx: Support complete page removal") Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/ioctl.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/a

[PATCH v4 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-07-05 Thread Dmitrii Kuvaiskii
4862-b146-dd57711c8...@intel.com/ v1: https://lore.kernel.org/all/20240429104330.3636113-3-dmitrii.kuvais...@intel.com/ v2: https://lore.kernel.org/all/20240515131240.1304824-1-dmitrii.kuvais...@intel.com/ v3: https://lore.kernel.org/all/20240517110631.3441817-1-dmitrii.kuvais...@intel.com/ D

Re: [PATCH v4 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-08-09 Thread Dmitrii Kuvaiskii
ps. TLDR: I can add similar handling to sgx_enclave_modify_types() if reviewers insist, but I don't see how this data race can ever be triggered by benign real-world SGX applications. -- Dmitrii Kuvaiskii

Re: [PATCH v4 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags

2024-08-12 Thread Dmitrii Kuvaiskii
On Wed, Jul 17, 2024 at 01:36:08PM +0300, Jarkko Sakkinen wrote: > On Fri Jul 5, 2024 at 10:45 AM EEST, Dmitrii Kuvaiskii wrote: > > SGX_ENCL_PAGE_BEING_RECLAIMED flag is set when the enclave page is being > > reclaimed (moved to the backing store). This flag however has two >

Re: [PATCH v4 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags

2024-08-12 Thread Dmitrii Kuvaiskii
On Wed, Jul 17, 2024 at 01:37:39PM +0300, Jarkko Sakkinen wrote: > On Fri Jul 5, 2024 at 10:45 AM EEST, Dmitrii Kuvaiskii wrote: > > +/* > > + * 'desc' bit indicating that PCMD page associated with the enclave page is > > + * busy (e.g. because the

Re: [PATCH v4 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-08-12 Thread Dmitrii Kuvaiskii
ode, and then to fix in a cleaner way? I thought that was the point of Dave Hansen's previous comments [2]. [1] https://lore.kernel.org/all/20240705074524.443713-2-dmitrii.kuvais...@intel.com/ [2] https://lore.kernel.org/all/1d405428-3847-4862-b146-dd57711c8...@intel.com/ -- Dmitrii Kuvaiskii

Re: [PATCH v4 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-08-12 Thread Dmitrii Kuvaiskii
128.3084051-1-dmitrii.kuvais...@intel.com/ -- Dmitrii Kuvaiskii

Re: [PATCH v4 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-08-12 Thread Dmitrii Kuvaiskii
modify_types() in the next iteration of this patch series. -- Dmitrii Kuvaiskii

[PATCH v5 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-08-21 Thread Dmitrii Kuvaiskii
36113-3-dmitrii.kuvais...@intel.com/ v2: https://lore.kernel.org/all/20240515131240.1304824-1-dmitrii.kuvais...@intel.com/ v3: https://lore.kernel.org/all/20240517110631.3441817-1-dmitrii.kuvais...@intel.com/ v4: https://lore.kernel.org/all/20240705074524.443713-1-dmitrii.kuvais...@intel.com/

[PATCH v5 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags

2024-08-21 Thread Dmitrii Kuvaiskii
he enclave page is being reclaimed (by the page reclaimer thread). A future commit will introduce new cases when the enclave page is being operated on; these new cases will set only the SGX_ENCL_PAGE_BUSY flag. Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii Reviewed-by: Haitao Huang Ack

[PATCH v5 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-08-21 Thread Dmitrii Kuvaiskii
lock(&encl->lock); /* * *BUG*: SIGBUS is returned * for a valid enclave page */ return VM_FAULT_SIGBUS; } } Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") Cc: sta...@vger.kernel.org Reported-by: Marcelina Koście
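
The race named in this patch (two threads faulting on the same non-present enclave page, with the losing thread receiving SIGBUS for a valid page) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel code: `struct page_table`, `insert_page()`, `handle_fault()`, and the `FAULT_*` names are hypothetical, the two racing threads are modelled as two sequential calls, and the sketch assumes the intended behaviour is for the loser to treat "page already inserted" as benign so that the access is simply retried.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16

/* Hypothetical fault results, loosely modelled on VM_FAULT_* codes. */
enum fault_result { FAULT_NOPAGE, FAULT_SIGBUS };

/* Hypothetical stand-in for the enclave's page array. */
struct page_table {
	bool present[NPAGES];
};

/* Insert a page; returns -EBUSY if another flow already inserted it. */
static int insert_page(struct page_table *pt, size_t idx)
{
	if (idx >= NPAGES)
		return -EINVAL;
	if (pt->present[idx])
		return -EBUSY;
	pt->present[idx] = true;
	return 0;
}

/*
 * Model of the fault path. The winner inserts the page; the loser sees
 * -EBUSY. Because the page is valid in both cases, both report "no page
 * installed by me, retry the access" rather than SIGBUS. Only a genuinely
 * invalid address is reported as an error.
 */
static enum fault_result handle_fault(struct page_table *pt, size_t idx)
{
	int ret;

	assert(pt != NULL);
	ret = insert_page(pt, idx);
	if (ret == 0 || ret == -EBUSY)
		return FAULT_NOPAGE;
	return FAULT_SIGBUS;
}
```

Calling `handle_fault()` twice for the same index models the winner and the loser; in this sketch both calls return `FAULT_NOPAGE`, whereas mapping `-EBUSY` to `FAULT_SIGBUS` would reproduce the bug the commit message describes.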

[PATCH v5 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-08-21 Thread Dmitrii Kuvaiskii
race would indicate a bug in a user space application), but it gives a consistent rule: if an enclave page is under certain operation by the kernel with the mapping removed, then other threads trying to access that page are temporarily blocked and should retry. Fixes: 9849bb27152c ("x86
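
The rule stated in this commit message (a page that the kernel is operating on is marked busy, and other threads that touch it are told to retry) can be sketched in user-space C. Everything here is an assumption made for illustration: the bit value of `PAGE_BUSY`, the `struct encl_page` layout, and the helper names are hypothetical and are not taken from the patch, and the locking that the real code relies on is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the real SGX_ENCL_PAGE_BUSY value is not shown here. */
#define PAGE_BUSY (1UL << 0)

enum access_result { ACCESS_OK, ACCESS_RETRY };

/* Hypothetical enclave page descriptor with a flags word. */
struct encl_page {
	unsigned long desc;
};

/* Mark the page as being operated on (e.g. while it is being removed). */
static void begin_operation(struct encl_page *page)
{
	page->desc |= PAGE_BUSY;
}

/* Clear the marker once the operation has finished. */
static void end_operation(struct encl_page *page)
{
	page->desc &= ~PAGE_BUSY;
}

/* Any other flow that finds the page busy backs off and retries later. */
static enum access_result try_access(const struct encl_page *page)
{
	assert(page != NULL);
	return (page->desc & PAGE_BUSY) ? ACCESS_RETRY : ACCESS_OK;
}
```

A caller would loop on `try_access()` until it stops returning `ACCESS_RETRY`; keeping the busy state in a single flag bit gives every flow the same check regardless of which operation set it.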

[PATCH v6 1/3] x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags

2024-09-24 Thread Dmitrii Kuvaiskii
he enclave page is being reclaimed (by the page reclaimer thread). A future commit will introduce new cases when the enclave page is being operated on; these new cases will set only the SGX_ENCL_PAGE_BUSY flag. Cc: sta...@vger.kernel.org Signed-off-by: Dmitrii Kuvaiskii Reviewed-by: Haitao Huang R

[PATCH v6 0/3] x86/sgx: Fix two data races in EAUG/EREMOVE flows

2024-09-24 Thread Dmitrii Kuvaiskii
/lore.kernel.org/all/20240821100215.4119457-1-dmitrii.kuvais...@intel.com/ Dmitrii Kuvaiskii (3): x86/sgx: Split SGX_ENCL_PAGE_BEING_RECLAIMED into two flags x86/sgx: Resolve EAUG race where losing thread returns SIGBUS x86/sgx: Resolve EREMOVE page vs EAUG page dat

[PATCH v6 2/3] x86/sgx: Resolve EAUG race where losing thread returns SIGBUS

2024-09-24 Thread Dmitrii Kuvaiskii
lock(&encl->lock); /* * *BUG*: SIGBUS is returned * for a valid enclave page */ return VM_FAULT_SIGBUS; } } Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave") Cc: sta...@vger.kernel.org Reported-by: Marcelina Kościelnicka Sugg

[PATCH v6 3/3] x86/sgx: Resolve EREMOVE page vs EAUG page data race

2024-09-24 Thread Dmitrii Kuvaiskii
: Support complete page removal") Fixes: 45d546b8c109d ("x86/sgx: Support modifying SGX page type") Cc: sta...@vger.kernel.org Reviewed-by: Kai Huang Reviewed-by: Jarkko Sakkinen Signed-off-by: Dmitrii Kuvaiskii --- arch/x86/kernel/cpu/sgx/encl.h | 3 ++- arch/x86/kernel/cpu/s